ABSTRACT

We use measurements of the projected galaxy correlation function Wp(rp) and galaxy void statistics to test 
whether the galaxy content of halos of fixed mass is systematically different in low density environments. We 
present new measurements of the void probability function (VPF) and underdensity probability function (UPF) 
from Data Release Four of the Sloan Digital Sky Survey (SDSS), as well as new measurements of the VPF from 
the full data release of the Two-Degree Field Galaxy Redshift Survey. We compare these measurements to pre- 
dictions calculated from models of the Halo Occupation Distribution (HOD) that are constrained to match both 
the projected correlation function Wp(rp) and the space density of galaxies n̄_g. The standard implementation of
the HOD assumes that galaxy occupation depends on halo mass only, and is independent of local environment. 
For luminosity-defined samples, we find that the standard HOD prediction is a good match to the observations, 
and the data exclude models in which galaxy formation efficiency is reduced in low-density environments. For 
L* samples we cannot rule out a slight increase in galaxy formation efficiency at low densities. More remark-
ably, we find that the void statistics of red and blue galaxies (at L ~ 0.4L*) are perfectly predicted by standard
HOD models matched to the correlation function of these samples, ruling out "assembly bias" models in which 
galaxy color is correlated with large-scale environment at fixed halo mass. We conclude that the luminosity 
and color of field galaxies are determined predominantly by the mass of the halo in which they reside and have 
little direct dependence on the environment in which the host halo formed. In broader terms, our results show 
that the sizes and emptiness of voids found in the distribution of L > 0.2L* galaxies are in excellent agreement 
with the predictions of a standard cosmological model with a simple connection between galaxies and dark 
matter halos. 

Subject headings: cosmology:theory — galaxies:halos — large scale structure of the universe 
1. INTRODUCTION

The Halo Occupation Distribution (HOD) has become 
one of the primary methods for analyzing and interpret-
ing galaxy clustering measurements (e.g., Kauffmann et al.
1997; Jing et al. 1998; Benson et al. 2000; Seljak 2000;
Peacock & Smith 2000; Ma & Fry 2000; Scoccimarro et al.
2001; Berlind & Weinberg 2002). The unique and powerful
aspect of the halo occupation approach is that it quantifies the 
bias of a class of galaxies with respect to the underlying dark 
matter distribution through the statistical relationship between 
galaxies and the dark matter halos in which they reside. In the 
HOD formalism, the bias of a galaxy sample is specified by 
the quantity P(N|M), the probability that a halo of mass M
contains N galaxies. Along with assumptions about the spa-
tial and velocity biases of galaxies with respect to the dark
matter within their host halos, P(N|M) describes the bias of
the sample on all scales and for any clustering measure. The
implicit assumption of this approach is that P(N|M) depends
only on the mass of the halo and is independent of the halo's
larger-scale environment. This "standard implementation" of 
the HOD has been called into question by a number of recent 
theoretical results. Thus it is important to test the underlying 
assumptions of the HOD and quantify any residuals of the 
standard implementation, reducing systematic uncertainties 
in the cosmological constraints derived from HOD modeling.
In turn these tests lead to insight on the processes of galaxy 
formation. In this paper we use new measurements of void 
probability statistics in the Sloan Digital Sky Survey (SDSS;
York et al. 2000) and Two-Degree Field Galaxy Redshift Sur-
vey (2dFGRS; Colless et al. 2001, 2003) to test whether the
relation between the properties of field galaxies and their host 
halos depends on mass only. We define field galaxies as iso- 
lated systems residing in low density regions of the galaxy 
distribution. In the halo occupation context, these are galax- 
ies that live at the center of halos at or near the minimum halo 
mass scale for the given galaxy class.

In Tinker et al. (2006) (hereafter Paper I), we demonstrated
that the statistics of galaxy voids are a sensitive diagnostic for 
environmental dependence of halo occupation. The statistics 
explored in Paper I were the void probability function (VPF, 
denoted P_0), and the underdensity probability function (UPF,
denoted P_U). The VPF is defined as the probability that a
sphere of radius r contains zero galaxies of a given type. The 
UPF is defined as the probability that a sphere of radius r has 
a galaxy density less than some fraction of the overall mean 
density for that galaxy type. Here we set that fraction to the 
conventional value of 0.2. Previous theoretical studies sought 
to determine what information, if any, void statistics alone 
offer for constraining galaxy bias (Little & Weinberg 1994;
Benson 2001; Berlind & Weinberg 2002). Paper I explored
void statistics in conjunction with other clustering measures, 
demonstrating that standard HOD models that match observa-
tions of the projected two-point correlation function Wp(rp)
and the number density of galaxies n̄_g predict nearly degen-
erate void statistics regardless of the mapping between halo
mass and central galaxy luminosity or the amplitude of dark
matter clustering, conclusions similar to those of Conroy et al.
(2005). The remarkable robustness of void statistics (under
the assumptions of the standard HOD) implies that they can 
be used to test these underlying assumptions. The two-point 
correlation function is dominated by galaxies in mean and 
high density regions of the universe. If one uses this statis- 
tic to constrain galaxy occupation and correctly predicts an- 
other clustering measure that probes underdense regions, then 
one infers that halo occupation at fixed mass does not change 
between high and low densities. 

Early studies concluded that the properties of dark mat- 
ter halos, such as their formation times and merger histories, 
were independent of, or only weakly dependent on, environment
(Lemson & Kauffmann 1999; Sheth & Tormen 2004). More
recent results, aided by higher resolution and larger-volume
simulations, detect a clear relation between formation times
and local environment (Gao et al. 2005; Harker et al. 2005;
Wechsler et al. 2006; Zhu et al. 2006; Gao & White 2006;
Wetzel et al. 2007). These studies conclude that this corre-
lation is strongest for low-mass halos, with a sign such that
older halos form in higher density regions. Attempts to mea-
sure this effect in high mass halos observationally have met
with conflicting results (Yang et al. 2006; Berlind et al. 2006).
Although the correlation between halo formation time and en-
vironment is now firmly established, the effect on the galaxy
population is less clear. Croton et al. (2006b, 2007) use their
semi-analytical galaxy formation model to quantify this "as-
sembly bias" in the galaxy population. Croton et al. (2007)
quantify assembly bias by its effect on the large-scale galaxy
two-point correlation function, b_x = sqrt(ξ/ξ_0), where ξ_0 is the
clustering amplitude of a model once the assembly bias has
been removed from the sample by scrambling galaxies among
halos of the same mass. For luminosity-defined samples, they
find b_x - 1 ~ 0.05 for faint galaxies, decreasing to -0.05 for
bright galaxies. The effect is strongest in their model for faint,
red, central galaxies, increasing the amplitude of the correla-
tion function of these objects by nearly a factor of 4. Be-
cause central galaxies define the voids (in the statistical sense
of the VPF and UPF), our approach is well-suited to test- 
ing this effect in the true galaxy distribution. Observational 
tests of the environmental dependence of galaxy properties
by Blanton et al. (2006b) have shown that the blue fraction
correlates with the galaxy density on small-scales (i.e., the 
scale of a large halo), but not with the larger-scale density field 
(see also Blanton et al. 2006a). Abbas & Sheth (2005, 2006) 
use the halo occupation formalism to calculate galaxy clus- 
tering as a function of local galaxy density, concluding that 
the standard P(N|M) approach correctly predicts the cluster-
ing of SDSS galaxies as a function of their local environment.
Skibba et al. (2006) use the standard HOD approach to accu-
rately predict the luminosity-weighted correlation function of
SDSS galaxies. Our use of void statistics is complementary 
to these tests, in that voids probe the most extreme galaxy en- 
vironments. While the papers above are sensitive to assembly 
bias of satellite galaxies or galaxies in mean and high-density 
environments, void statistics are affected by the bias of a small 
subset of the overall galaxy sample, making them more sensi- 
tive to assembly bias in central galaxies and at low densities. 

In this paper we present new measurements of the VPF 
and UPF from Data Release Four of the SDSS (DR4;
Adelman-McCarthy et al. 2006). Through the use of a larger
observational sample, this work extends earlier analyses of
void statistics from the CfA redshift survey (Vogeley et al.
1994), the 2dFGRS (Croton et al. 2004; Hoyle & Vogeley
2004; Patiri et al. 2006), and Data Release Two of the SDSS
(Conroy et al. 2005). We also present new measurements of
the VPF from the full data release of the 2dFGRS that are 
better suited to the purposes of this study than earlier analy- 
ses. We compare these data to predictions for the VPF and 
UPF created with the standard implementation of the HOD 
and for models in which the occupation of central galaxies de- 
pends on environment. All models are constrained to match 
Wp(rp) and n̄_g. Using the parameterization of Paper I, we cre-
ate density-dependent models in which the minimum mass
scale for hosting a central galaxy shifts by a factor f_min in
environments where the density falls below a threshold value
δ_c. A value of f_min > 1 physically represents a model in which
galaxy formation becomes less efficient in low-density regions,
creating positive assembly bias (b_x > 1). Models with f_min < 1
imply an increase in galaxy formation efficiency, in the sense
that a given mass halo can host a more luminous galaxy, yield-
ing negative assembly bias (b_x < 1). We show that the void
statistics for faint galaxies, M_r - 5 log h < -19, are accurately
predicted by the standard HOD, while models with density de-
pendence always produce a worse fit to the observational data.
The void statistics for bright galaxies, M_r - 5 log h < -21, are
adequately fit by the standard HOD prediction, while models
with positive assembly bias are strongly excluded. (For ref-
erence, the characteristic luminosity L* in the Blanton et al.
(2003) r-band luminosity function is M_r - 5 log h = -20.44.)
We also make predictions for void statistics in the 2dFGRS. 
We find once again that the standard HOD accurately predicts 
the VPF in these samples, leaving little room for assembly 
bias. 

We also explore models for faint color-defined galaxy sam- 
ples. The dependence of galaxy color and morphology on 
local environment is well established (e.g., Dressler 1980;
Postman & Geller 1984). The correlation between color
and environment has been refined with the increased statis-
tics of the SDSS (Blanton et al. 2005a; Park et al. 2007).
Berlind et al. (2005) use cosmological hydrodynamic simula-
tions to demonstrate that these variations of color with en-
vironment can be explained by the variations of the halo
mass function with environment only, without variations of
halo occupation at fixed mass. The observational results of
Blanton et al. (2006b) support this conclusion. However, the
theoretical results of Croton et al. (2007) imply that environ-
mental effects of halo occupation should be strong for color-
defined samples. In their semi-analytic model, faint red cen-
tral galaxies preferentially occupy halos in dense regions (at
fixed halo mass). This is contrary to the standard HOD as- 
sumption that the central galaxy of a halo has a probability of 
being red that is independent of environment. We show that 
the measured VPFs are well fit by the standard HOD predic-
tions for these samples. An assembly bias as strong as that in
the semi-analytic model of Croton et al. (2007) would likely
be detectable within the given errors of our VPF measure- 
ments. 

Section 2 presents the details of our measurements of the 
VPF and UPF from the SDSS, and our methodology for mak- 
ing predictions for these statistics from the HOD. Section 3 
presents the results for luminosity-defined samples from the 
SDSS, comparing observational measurements to HOD pre- 
dictions using both the standard implementation and models 
with density dependence. In §4 we show results for color- 
selected galaxy samples from the SDSS, and compare to HOD 
models. In §5 we present results for luminosity-defined sam-
ples from the 2dFGRS. In §6 we discuss these results.

2. SDSS DATA AND MODELING 

2.1. Observational Samples and Measurements 

For SDSS galaxies, we use measurements of Wp(rp) from
Zehavi et al. (2005) (hereafter Z05). These measurements
were performed on volume-limited samples from a spectro-
scopic sample of nearly 200,000 galaxies, from an angular
survey area of 2497 deg^2, approximately the size of Data Re-
lease Two of the SDSS (DR2; Abazajian et al. 2004). We
use four volume-limited samples defined by r-band magni-
tude thresholds M_r - 5 log h = -19, -20, -21, and -22. For
all samples, we utilize the full covariance error matrix of the
measurements when comparing HOD models of Wp(rp) to ob-
servations. To measure the void statistics in DR4 of the SDSS
we use the NYU Value Added Galaxy Catalog (Blanton et al.
2005b). This sample is larger in volume than the Z05 sample;
the survey area for DR4 is 4783 deg^2, but the flanking fields
and other isolated patches are not well suited for our measure-
ments, and are not used. As in the Z05 samples, all galaxies
are k-corrected to redshift z = 0.1 using the software package
kcorrect (Blanton & Roweis 2006). Although the larger
volume of DR4 might lead to differences in Wp(rp), the sam-
ples for which Wp(rp) has been measured in DR4 are within
the errors of the Z05 data (I. Zehavi, private communication).
As we will demonstrate in §2.3, the measurements of Wp(rp)
used in this paper are sufficient to constrain the HOD for the
M_r - 5 log h < -19 and -21 samples, so that the uncertainties
in HOD parameters are nearly negligible in comparison to the 
measurement errors on the VPF and UPF. When analyzing the 
data we use the full error covariance matrix, also taken from 
Z05. 

To measure the UPF and VPF from the survey at a given 
r, we randomly place a large number of spheres with radius r 
within the survey, counting the number of galaxies located in 
each sphere. We limit the number of spheres to a maximum 
of 10^ and minimum of 10^, numbers that have been tested 
to ensure convergence. The largest number of spheres is used 
at small radii to reduce the shot noise in the measurement at 
those scales. Once the counts in each cell are determined, the 
VPF is defined as the fraction of empty spheres, i.e.,

$$ P_0(r) = \frac{N_0}{N_{\rm tot}}, \qquad (1) $$

where N_N refers to the number of spheres that contain N
galaxies, and N_tot indicates the total number of spheres. The
UPF is defined as the fraction of spheres that contain less than
20% of the expected number of galaxies from the mean den-
sity,

$$ P_U(r) = \frac{1}{N_{\rm tot}} \sum_{N=0}^{N_U(r)} N_N, \qquad (2) $$

where N_U(r) = floor(0.2 × n̄_g 4πr^3/3). While P_0(r) rapidly
approaches zero at radii larger than the mean galaxy sepa-
ration, P_U(r) falls off approximately as an exponential func-
tion and is less subject to shot noise at larger r. Thus it is
possible to measure P_U(r) more accurately at larger scales
than P_0(r). Paper I also demonstrated that these statistics
have somewhat complementary information when testing for
density dependence in ⟨N⟩_M; altering galaxy formation effi-
ciency may eliminate galaxies from low-density regions with-
out entirely emptying them.
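
The counts-in-spheres estimators described above can be illustrated with a minimal Python sketch. It assumes a periodic cubic box (as in a simulation) rather than the real survey geometry, uses a brute-force pair search, and the function name `void_statistics` and its arguments are hypothetical, chosen only for this illustration.

```python
import math
import random


def void_statistics(points, box, radius, n_spheres, mean_density,
                    threshold=0.2, seed=0):
    """Return (P_0, P_U) at one radius from counts in random spheres.

    points       : list of (x, y, z) positions in a periodic cube of side `box`
    mean_density : mean galaxy number density, in the same length units
    """
    rng = random.Random(seed)
    volume = 4.0 * math.pi * radius ** 3 / 3.0
    # N_U(r) = floor(0.2 * n_g * 4 pi r^3 / 3)
    n_u = math.floor(threshold * mean_density * volume)
    r2 = radius * radius
    n_empty = 0
    n_under = 0
    for _ in range(n_spheres):
        center = [rng.uniform(0.0, box) for _ in range(3)]
        count = 0
        for p in points:
            d2 = 0.0
            for a, b in zip(p, center):
                d = abs(a - b)
                d = min(d, box - d)  # minimum-image distance (periodic box)
                d2 += d * d
            if d2 < r2:
                count += 1
        if count == 0:
            n_empty += 1   # contributes to the VPF
        if count <= n_u:
            n_under += 1   # contributes to the UPF
    return n_empty / n_spheres, n_under / n_spheres
```

Because every empty sphere also satisfies the underdensity condition, this estimator always returns P_U ≥ P_0. A production measurement would replace the inner loop with a spatial index and would draw sphere centers only from regions that pass the survey mask.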

Our handling of the SDSS survey geometry and complete-
ness closely parallels that of Conroy et al. (2005). The com-
pleteness, defined as the ratio of successfully attained red-
shifts to targetable objects, varies non-trivially from 0 to 1
as a function of right ascension and declination. Sophisti-
cated software has been developed to efficiently handle com-
plex survey geometries such as that of the SDSS. In order to identify
and avoid regions of low completeness we use the Mangle
package (Hamilton & Tegmark 2004) to generate a densely
sampled angular window function. This window function in- 
corporates regions of the sky not surveyed, either because the 
region lies outside the bounds of the survey or because of 
bright foreground stars, and incompleteness within the survey 
due either to fiber collisions (no two fibers can be separated 
by less than 55 arcseconds, affecting 7% of targetable red- 
shifts) or objects that could not be assigned a reliable redshift, 
affecting ~ 1% of targetable objects. 

In order to treat edge-effects arising from measuring the 
VPF and UPF via counts-in-spheres, we convolve the win-
dow function with a circular smoothing kernel of angular ra-
dius θ(r,z) corresponding to a sphere of comoving radius r
at redshift z projected onto the plane of the sky. This con-
volution yields the total completeness of the survey at each
point in the sky for a given angular sphere size, where the in- 
completeness could arise from either a sphere lying partially 
off the edge of the survey or being in a region of the survey 
with low spectroscopic completeness. We then place random 
spheres only at points above a minimum convolved complete- 
ness. This allows us to robustly avoid regions of bright stars, 
regions of low completeness (due, for example, to inclement 
weather during observations) and the edges of the survey. The 
distribution of completenesses is approximately a Gaussian 
centered at ~ 88% with an additional constant component ex-
tending to low completeness. Motivated by this distribution,
we place spheres only in regions above a minimum convolved 
completeness of 83%, noting that our results are insensitive to 
this exact value. Spheres are placed uniformly along the line 
of sight because each sample is volume limited. 

In the above methodology, completeness issues are han- 
dled by including only those regions of the survey which are 
both high and uniform in completeness and then incorporating 
the remaining small incompleteness effects into model predic-
tions (which we will discuss below). An alternative method-
ology has been proposed by Croton et al. (2004), in which in-
completeness effects are treated by correcting the measured
VPF in order to recover the "true" underlying VPF of the
galaxy distribution. This particular correction scheme counts
the number of galaxies within a sphere of radius r' = r/f^(1/3) as
contributing to the VPF at radius r (f is the convolved com-
pleteness at r). This scheme in essence treats incompleteness
as missed volume rather than missed galaxies. Although this 
correction is exact in the Poisson limit, it will over-correct the 
VPF to some degree at larger r or lower Po(r). The system- 
atic error accrued is difficult to estimate without the use of 
detailed mock catalogs, reducing the usefulness of the correc- 
tion method in the first place. Thus to compare our models 
to data, we modify the theoretical predictions to match the in- 
completeness of the survey, rather than trying to remove the 
incompleteness from the survey itself. We will discuss this 
further in §2.3. 
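
To make the "missed volume" interpretation concrete, the short Python sketch below computes the enlarged sphere radius used by that alternative correction. It assumes that a completeness f is compensated by growing the sphere volume by a factor 1/f, which gives r' = r / f^(1/3); the function name `rescaled_radius` is hypothetical.

```python
def rescaled_radius(radius, completeness):
    """Radius r' of the enlarged sphere used in a volume-based correction.

    The sphere volume is scaled as V' = V / f, so r' = r / f**(1/3).
    """
    if not 0.0 < completeness <= 1.0:
        raise ValueError("completeness must lie in (0, 1]")
    return radius / completeness ** (1.0 / 3.0)


# Example: a sphere of radius 10 at the 83% completeness threshold.
r_prime = rescaled_radius(10.0, 0.83)
```

For f = 0.83 the radius grows by roughly 6%, which shows why the correction is small in high-completeness regions but is not negligible for the largest spheres.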

As in Z05, we create two separate volume-limited samples
with M_r - 5 log h < -20. The maximum redshift for these ob-
jects is z = 0.10, but this redshift limit includes the so-called
'Sloan Great Wall' supercluster (Gott et al. 2005; see also
Baugh et al. 2004 for results from the 2dFGRS). This struc-
ture dominates the overall clustering of the full -20 sample,
and the presence of such a structure makes it difficult to accu-
rately estimate the true cosmic variance for this sample. We
follow Z05 in focusing on a sample restricted to the same
redshift limit as the M_r - 5 log h < -19 sample of z < 0.06.
Unless otherwise stated, all results for these galaxies use the
restricted redshift sample.

Fig. 1. Projected correlation function data and HOD fits for the M_r - 5 log h < -19 sample (panels [a] and [c], respectively) and the M_r - 5 log h < -20 sample
(panels [b] and [d], respectively). In the top panels, points with error bars are the SDSS data of Z05, while the gray region represents the range in HOD fits with
Δχ^2_wp < 1 with respect to the best-fit HOD model. Bottom panels plot the mean occupation functions ⟨N⟩_M for 20 randomly chosen HOD fits with Δχ^2_wp < 1.
Results in panels (b) and (d) are for the M_r - 5 log h < -20 sample restricted to z < 0.06.

TABLE 1

Properties of the SDSS Volume Limited Catalogs

Sample   z_min   z_max   n̄_g            f_comp   Volume [(h^-1 Mpc)^3]
-19      0.02    0.06    1.19 × 10^-2   0.873    1.78 × 10^6
-20      0.02    0.06    4.33 × 10^-3   0.873    1.78 × 10^6
-20'     0.02    0.10    4.93 × 10^-3   0.873    8.28 × 10^6
-21      0.03    0.15    1.01 × 10^-3   0.876    2.11 × 10^7
-22      0.05    0.22    5.77 × 10^-5   0.876    8.15 × 10^7
red      0.02    0.06    3.28 × 10^-3   0.873    1.78 × 10^6
blue     0.02    0.06    4.33 × 10^-3   0.873    1.78 × 10^6

Note. Number densities are given in units of (h^-1 Mpc)^-3. f_comp is the mean completeness of each
sample. -20' refers to the unrestricted sample that includes the Sloan Great Wall. See §2.2 for discussion.
V is the volume of each sample. Samples are defined with luminosity thresholds, but the red and blue
samples are restricted to the magnitude range -20 < M_r - 5 log h < -19.

Fig. 2. Projected correlation function data and HOD fits for the M_r < -21 sample (panels [a] and [c], respectively) and the M_r < -22 sample (panels [b] and
[d], respectively). In the top panels, points with error bars are the SDSS data of Z05, while the gray region represents the range in HOD fits with Δχ^2_wp < 1 with
respect to the best-fit HOD model. Bottom panels plot the mean occupation functions ⟨N⟩_M for 20 randomly chosen HOD fits with Δχ^2_wp < 1.

2.2. HOD Modeling 

We constrain the occupation function by fitting the ob-
served Wp(rp) and n̄_g for each sample with the analytic model
for Wp(rp) described in Tinker et al. (2005) (see also Zheng
2004; Zehavi et al. 2004). The mean occupation function is
divided into two terms: central galaxies located at the center
of mass of the halo, and satellite galaxies distributed through-
out the halo. For SDSS samples defined by a luminosity
threshold, the central occupation function takes the form

$$ \langle N_{\rm cen} \rangle_M = \frac{1}{2} \left[ 1 + {\rm erf} \left( \frac{\log M - \log M_{\rm min}}{\sigma_{\log M}} \right) \right], \qquad (3) $$

where M_min is a cutoff mass scale and all logarithms are base-
10. Formally, in equation (3), M_min is the mass at which
⟨N_cen⟩_M = 0.5. The parameter σ_logM describes the shape of the
central galaxy cutoff. Physically, this parameter represents
the scatter between halo mass and central galaxy luminosity;



if this scatter is large then a fraction of low-mass halos will be 
included in the sample and the shape of the cutoff will be soft. 
If this scatter is small then central galaxies follow a nearly 
one-to-one mapping of mass to luminosity, and ⟨N_cen⟩_M re-
sembles a step function.

The satellite galaxy occupation function is modeled as a
truncated power law,

$$ \langle N_{\rm sat} \rangle_M = \left( \frac{M - M_{\rm cut}}{M_{\rm sat}} \right)^{\alpha_{\rm sat}}, \qquad (4) $$

where M_cut is a cutoff mass scale for satellites, M_sat is the
amplitude of the power law, and α_sat is its slope. In equa-
tion (4), the mass at which halos host on average one satel-
lite is M_1 = M_cut + M_sat. The total occupation function is
⟨N⟩_M = ⟨N_cen⟩_M + ⟨N_sat⟩_M. As expressed in equations (3) and
(4), the occupation function has five free parameters. In prac-
tice, the number of free parameters is reduced to four because
M_min is set by n̄_g once the other parameters have been cho-
sen. One can accurately fit Wp(rp) with only a three-parameter
occupation function (e.g., Zehavi et al. 2004, 2005), but we
allow ⟨N⟩_M extra freedom to explore how variations in the
shape of ⟨N⟩_M alter the predicted void statistics. In Paper I
we demonstrated that the void statistics are relatively insensi-
tive to the values of σ_logM and M_min allowed by Wp(rp) and n̄_g, but to quan-
tify the uncertainty in our predicted void statistics we leave all
parameters free. For each Wp(rp), the best-fit model is found
by χ^2 minimization using the full covariance error matrix of
the data. To minimize χ^2 we use the Monte Carlo Markov
chain method (MCMC). While less efficient than other tech-
niques, MCMC quantifies the errors on the HOD parameters.
For each sample, we randomly select twenty HOD fits from
the MCMC chain that have Δχ^2 < 1 with respect to the best-
fit model. These 20 fits will be used to estimate the range in
HOD predictions for the void statistics allowed by the Wp(rp)
data. The best-fit models for each sample are listed in Table
2.
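
As an illustration of the five-parameter occupation function, the Python sketch below evaluates the central and satellite terms for a single halo mass. It assumes the error-function form for centrals (equal to 0.5 at M_min) and a power law in (M - M_cut)/M_sat for satellites that is set to zero below M_cut; the function names and any parameter values passed to them are hypothetical and are not the fitted values of Table 2.

```python
import math


def n_cen(mass, m_min, sigma_logm):
    """Mean number of central galaxies in a halo of the given mass."""
    x = (math.log10(mass) - math.log10(m_min)) / sigma_logm
    return 0.5 * (1.0 + math.erf(x))


def n_sat(mass, m_cut, m_sat, alpha_sat):
    """Mean number of satellites: a power law truncated at m_cut."""
    if mass <= m_cut:
        return 0.0  # assumed truncation: no satellites below the cutoff
    return ((mass - m_cut) / m_sat) ** alpha_sat


def n_total(mass, m_min, sigma_logm, m_cut, m_sat, alpha_sat):
    """Total mean occupation, the sum of central and satellite terms."""
    return n_cen(mass, m_min, sigma_logm) + n_sat(mass, m_cut, m_sat, alpha_sat)
```

With these definitions, a halo of mass M_1 = M_cut + M_sat hosts one satellite on average, and a small σ_logM drives the central term toward a step function at M_min.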

Figure 1 presents the results of the HOD analysis of the
M_r - 5 log h < -19 and -20 samples. Figures 1a and 1b plot the
data with diagonal error bars, along with the sample of twenty
HOD fits from the MCMC chain. Figures 1c and 1d present
the occupation functions for each of those twenty fits for faint
and bright samples, respectively. For the M_r - 5 log h < -19
sample, the twenty projected correlation functions calculated
from the HOD fits are nearly indistinguishable. But the oc-
cupation functions in Figure 1c differ substantially at low masses.
Because M_min for this sample is significantly below the non-
linear mass scale M_* = 8.60 × 10^12 h^-1 M_⊙ for this cosmol-
ogy, Wp(rp) is relatively unaffected by softer or harder cen-
tral cutoffs; the mean bias of the HOD is largely unaffected
by variations in σ_logM. In Paper I we demonstrated that the
distribution of voids is also unaffected by such changes to
the occupation function, yielding degenerate VPFs and UPFs.
Figure 1d presents the twenty occupation functions for the
M_r - 5 log h < -20 sample. For this sample, the shape of the
central galaxy cutoff is essentially unconstrained; the range in
σ_logM from the twenty MCMC models is 1.4 to 0.05. Because
the volume of this sample is the same as the M_r - 5 log h < -19
sample, the differences in the constraints are somewhat sur-
prising. The sizes of the diagonal errors on Wp(rp) are similar,
but the data points for the brighter galaxies are more corre-
lated, reducing the constraining power for this sample.

Figure 2 shows the same quantities as the previous fig-
ure, but now for the M_r - 5 log h < -21 sample and the
M_r - 5 log h < -22 sample. Figure 2c presents the twenty oc-
cupation functions for the M_r - 5 log h < -21 sample. For this
sample, M_min ~ M_*, thus the constraints on σ_logM from Wp(rp)
alone are substantially stronger than for the fainter samples.
For the brightest galaxies, Figure 2b shows large differences
in one-halo clustering among acceptable models, resulting in
significant differences in ⟨N_sat⟩_M in Figure 2d. The lack of
strong constraints on the HOD prevents the use of this sample
and the M_r - 5 log h < -20 sample for testing assembly bias in
the void statistics. But, as we will show in the following sec-
tion, for these samples the constraints on the HOD can be enhanced
moderately through the addition of the VPF and UPF.

2.3. Mock Catalogs 

Once the best-fit HOD model is identified, we predict void 
statistics by populating the halos identified in dark matter N- 
body simulations. Central galaxies are located at the center of 
mass of the halo, and satellite galaxies are placed randomly 
throughout the halo such that they follow the density profile of 
Navarro et al. (1997) with a concentration parameter given by
the model of Bullock et al. (2001). Central galaxies are given
the velocity of the halo center of mass, while satellite galax-
ies are given an additional random velocity in each Cartesian
direction drawn from a Gaussian distribution with dispersion
equal to the virial velocity of the halo, σ_vir^2 = GM/2R_vir, where
we have defined R_vir to be the radius at which the mean inte-
rior density of the halo is 200 times the background density.
All calculations of the VPF and UPF are performed in red- 
shift space using the distant observer approximation, with the 
z-axis as the line of sight. Our results are insensitive to the 
value of σ_vir or possible velocity bias of the galaxies within
reasonable limits (i.e., variations less than ~ 40%). Although
these parameters alter the redshift space positions of galaxies, 
the net effect on the void statistics is negligible. As with the 
observational measurements, we calculate the VPF and UPF 
using 10^- 10^ random spheres at each radius. Errors on the 
calculations are estimated by jackknife sampling of the simu-
lation volume into 125 subsamples.
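
A jackknife error estimate of this kind can be sketched in a few lines of Python. The sketch assumes that the 125 subsamples are a 5 × 5 × 5 grid of cubic subvolumes (the text does not specify the geometry) and that the statistic has already been recomputed with each subvolume removed in turn; both function names are hypothetical.

```python
import math


def subvolume_index(x, y, z, box, n_side=5):
    """Index (0 .. n_side**3 - 1) of the cubic subvolume containing a point."""
    cell = box / n_side
    ix = min(int(x / cell), n_side - 1)
    iy = min(int(y / cell), n_side - 1)
    iz = min(int(z / cell), n_side - 1)
    return (ix * n_side + iy) * n_side + iz


def jackknife_error(leave_one_out):
    """Jackknife standard error from leave-one-out estimates of a statistic.

    leave_one_out[i] is the statistic measured with subvolume i removed.
    """
    n = len(leave_one_out)
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean) ** 2 for x in leave_one_out)
    return math.sqrt(variance)
```

In use, spheres whose centers fall in subvolume i would be dropped when forming the i-th estimate of P_0(r) or P_U(r), and `jackknife_error` would be applied separately at each radius.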

We use two simulations to calculate void statistics, a
smaller box 400 h^-1 Mpc on a side and a larger box
1086 h^-1 Mpc on a side. Both simulations are inflation-
ary cold dark matter models with identical cosmologies.
The linear matter power spectrum used to create the ini-
tial conditions of each simulation was calculated with CMB-
FAST (Seljak & Zaldarriaga 1996) with the parameter set
(Ω_m, σ_8, Ω_b, n_s, h) = (0.3, 0.9, 0.04, 1.0, 0.7). For fainter galax-
ies we utilize the smaller simulation to make predictions. This
is the same simulation used in Paper I, consisting of 1280^3
particles, yielding a particle mass of 2.54 × 10^9 h^-1 M_⊙. To
model the brighter galaxies we populate the larger simulation.
This simulation contains 1024^3 particles, yielding a particle
mass of 9.95 × 10^10 h^-1 M_⊙. For both simulations, the ini-
tial conditions are integrated with the hashed oct-tree code
of Warren & Salmon (1993), with Plummer force softening
lengths of 14.6 h^-1 kpc and 40 h^-1 kpc for the small and large
boxes, respectively. Halos are identified in the simulations
using the friends-of-friends algorithm with a linking param-
eter of 0.2 times the mean interparticle separation, a value
that selects halos roughly corresponding to our adopted defi-
nition of a virial overdensity of 200 (Davis et al. 1985). To be
self-consistent, all analytic calculations are performed with
the same set of cosmological parameters listed above. For
these calculations, the halo mass function is obtained with
the fitting function of Jenkins et al. (2001). For the halo bias
function, we use the fitting function from Tinker et al. (2005).
This bias relation utilizes the functional form presented in
Sheth et al. (2001), but with parameter values (a = 0.707,
b = 0.35 and c = 0.8) calibrated on a set of larger-volume N-
body simulations with widely varying cosmologies.
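
The quoted particle masses follow from the box size, particle number, and matter density. The Python sketch below is a worked check, assuming box lengths in h^-1 Mpc, Ω_m = 0.3, and a critical density of about 2.775 × 10^11 h^2 M_⊙ Mpc^-3; the function name `particle_mass` is hypothetical.

```python
RHO_CRIT = 2.775e11  # critical density in h^2 M_sun / Mpc^3 (approximate)


def particle_mass(box, n_per_side, omega_m):
    """Particle mass in h^-1 M_sun for a box of side `box` (h^-1 Mpc)
    sampled with n_per_side**3 equal-mass particles."""
    total_mass = omega_m * RHO_CRIT * box ** 3
    return total_mass / n_per_side ** 3


small_box = particle_mass(400.0, 1280, 0.3)   # smaller simulation
large_box = particle_mass(1086.0, 1024, 0.3)  # larger simulation
```

Under these assumptions the two values come out near 2.5 × 10^9 and 9.9 × 10^10 h^-1 M_⊙, in agreement with the particle masses quoted for the two boxes to within about 1%.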

As mentioned in §2.1, we modify the number density of 
galaxies in each mock to match that measured in each obser-
vational sample. At each radius the mean number density of
SDSS galaxies within spheres, n̄_g(r), is calculated. The maxi-
mum deviation of n̄_g(r) from the overall mean averaged over
all radial bins is less than ~ 2% for each luminosity sam-
ple, demonstrating that our treatment of the survey mask is
robust and that we are probing the same volume with each
sphere size. The mean number densities for each sample are
listed in Table 1, along with other details of each sample.
Due to incompleteness, these number densities are less than
that expected from the measurement of the r-band luminos-
ity function by Blanton et al. (2003). When calculating the
VPF in our mock galaxy distributions, we dilute the mocks to
match the number densities of the data at each radius. Using
the overall mean density produces minimal differences in the
theoretical predictions, with small differences at the lowest-r
points where the VPF is Poisson dominated.

The galaxy number density required by the HOD analysis of $w_p(r_p)$ is the true number density, which must be estimated from observational samples and has an error associated with it due to cosmic variance. As noted in Abazajian et al. (2005), this error is < 5% for the $M_r - 5\log h < -21$ sample. To test the sensitivity of our model predictions to errors in the true number density, we alter $\bar{n}_g$ by ±10% and re-fit $w_p(r_p)$. The resulting void statistics, once matched to the sample number densities, are within the errors on the theoretical estimates. Due to the steepness of the halo mass function, increasing or decreasing $\bar{n}_g$ by 10% alters the mass scales of the HOD parameters ($M_{\rm min}$, $M_1$, $M_{\rm cut}$) by $\sim 5\%$ (with the opposite sign of the change in $\bar{n}_g$), but the overall shape of the HOD is nearly unchanged. We conclude that cosmic variance errors on the true galaxy number density do not bias our results.

2.4. Error Estimation and Systematics 

We estimate the errors on the measured void statistics with our simulations, described in the previous section. The volume of the $M_r - 5\log h < -19$ and $-20$ samples is approximately equal to a cube 120 $h^{-1}$ Mpc per side, 1/37 the volume of the 400 $h^{-1}$ Mpc box. Once the HOD predictions have been calculated, using the best-fit HOD from the $w_p(r_p)$ fitting, the box is divided into 27 cubic subregions, each 133 $h^{-1}$ Mpc per side. The dispersion among the subregions is scaled by $(133/120)^{3/2} = 1.17$ to correct for the fact that the subregions do not exactly match the volume of the observational sample. This scaled dispersion is taken to be the error on the observational quantity. We also estimate the covariance matrix from this method. This method is more robust than estimating the errors directly from the observational sample because of variations of the galaxy number density on 120 $h^{-1}$ Mpc scales. When estimated directly from the data, these fluctuations cause the errors to be underestimated with respect to the dispersion amongst the simulation subregions. At large scales, $r \sim 10$ $h^{-1}$ Mpc in the fainter samples, proper error estimation from the data is also inhibited by the small sample volume. To estimate errors for the brighter samples, the same process is followed using the 1086 $h^{-1}$ Mpc box. For the $M_r - 5\log h < -21$ sample, this larger simulation is approximately 47 times larger than the observational sample. For $M_r - 5\log h < -22$ galaxies, the volume of the large simulation is equivalent to 16 times the observational sample. When calculating $\chi^2$ for a given model, we neglect the innermost (r = 1 $h^{-1}$ Mpc) data point. In tests we find that the errors on this scale require a prohibitive number of random spheres to converge, and including this point in the covariance matrix introduces significant noise to the error estimate. The data at this scale contain little useful information anyway; the behavior of the VPF is nearly Poisson at r < 1 $h^{-1}$ Mpc for all luminosity samples (Croton et al. 2004).

For clarity, we will refer to $\chi^2$ values with respect to $P_0(r)$, $P_U(r)$, and $w_p(r_p)$ as $\chi^2_{\rm VPF}$, $\chi^2_{\rm UPF}$, and $\chi^2_{w_p}$, respectively.
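The subvolume error estimate described in this section can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function name, the array layout, the use of the sample standard deviation, and the random placeholder data are all assumptions. The only ingredients taken from the text are the 27 subregions, the side lengths, and the $(\mathrm{side\ ratio})^{3/2}$ scaling, which follows from errors scaling as volume$^{-1/2}$.

```python
import numpy as np


def subvolume_errors(stat_per_subvolume, sub_side=400.0 / 3.0, obs_side=120.0):
    """Scaled dispersion and covariance of a statistic measured in subvolumes.

    stat_per_subvolume : array of shape (n_sub, n_radii), e.g. P0(r) measured
                         in each of the 27 subcubes of the 400 h^-1 Mpc box.
    sub_side, obs_side : side lengths (h^-1 Mpc) of a subcube and of the cube
                         with the same volume as the observational sample.
    """
    stats = np.asarray(stat_per_subvolume, dtype=float)
    # Errors scale as volume^(-1/2), i.e. as (side length)^(-3/2).
    scale = (sub_side / obs_side) ** 1.5
    sigma = scale * stats.std(axis=0, ddof=1)
    cov = scale**2 * np.cov(stats, rowvar=False)
    return scale, sigma, cov


# Placeholder measurements (random numbers, not simulation output).
rng = np.random.default_rng(0)
fake_vpf = rng.normal(0.5, 0.02, size=(27, 10))
scale, sigma, cov = subvolume_errors(fake_vpf)
```

The diagonal of the returned covariance matrix equals the square of the scaled dispersion, so the two outputs are consistent by construction.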

3. RESULTS FOR LUMINOSITY-DEFINED SDSS SAMPLES 
3.1. Observational Results and HOD Predictions 

Our approach is to take a random sample of 20 HOD models that all produce accurate fits to the $w_p(r_p)$ data, and for each model calculate the VPF and UPF. All HOD models have $\Delta\chi^2_{w_p} < 1$ with respect to the best-fitting model. As shown in Table 1, the best-fit models all yield $\chi^2_{w_p}/\nu \sim 1$. Conroy et al. (2005) and Paper I concluded that $P_0(r)$ contained little additional information about the galaxy distribution relative to the two-point correlation function. If this is exactly true, and the precision of the measurements of the different statistics is equal, we would expect that 1) all 20 models will produce good fits to the void statistics, and 2) the range in $\chi^2_{\rm VPF}$ will be approximately 1, just as with the distribution of $\chi^2_{w_p}$ values. If the void statistics do contain complementary information about the galaxy distribution, one or both of these expectations will be violated. An alternate method would be to perform a joint fit to $w_p(r_p)$, $P_0(r)$, and $P_U(r)$ simultaneously, and then compare the constraints on HOD parameters to the analysis in which $w_p(r_p)$ is considered alone. Because calculating the void statistics involves the use of an N-body simulation, this procedure is time intensive. It also requires an estimate of the covariance between all three data sets, which is not available. The joint fit is more rigorous than the method we employ, but our approach provides a straightforward test of the HOD models, and discrepancies between predictions and measurements are readily detectable and quantifiable.
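The model selection step described above, drawing 20 random models whose fit to the correlation function lies within $\Delta\chi^2 < 1$ of the best fit, might look like the following sketch. The function name, the chain layout, and the placeholder chain are hypothetical; the text does not specify how the MCMC chain is stored or sampled.

```python
import numpy as np


def pick_models(chain_params, chain_chi2, n_models=20, max_dchi2=1.0, seed=0):
    """Randomly draw models within max_dchi2 of the best-fit chi^2.

    chain_params : array of shape (n_steps, n_params)
    chain_chi2   : array of shape (n_steps,), chi^2 of each chain element
    Raises ValueError if fewer than n_models elements qualify.
    """
    chi2 = np.asarray(chain_chi2, dtype=float)
    dchi2 = chi2 - chi2.min()
    allowed = np.flatnonzero(dchi2 < max_dchi2)
    rng = np.random.default_rng(seed)
    chosen = rng.choice(allowed, size=n_models, replace=False)
    return np.asarray(chain_params)[chosen], dchi2[chosen]


# Placeholder chain: 500 elements with 5 HOD parameters each.
rng = np.random.default_rng(42)
params = rng.normal(size=(500, 5))
chi2 = 10.0 + np.linspace(0.0, 5.0, 500)
models, dchi2 = pick_models(params, chi2)
```

Each selected parameter set would then be used to populate the simulation and measure the void statistics.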

Figure 3 plots the measured SDSS VPF for the four luminosity samples in Table 1 and compares them to the best-fit models using the standard implementation of the HOD. In each panel, the points with error bars represent the observed SDSS values. Lines show the VPF obtained from the populated simulations. The lower panels in each quadrant present the residuals of the model from the data. We define the residual as $\Delta P/\sigma_{\rm SDSS} = (P_0^{\rm HOD} - P_0^{\rm SDSS})/\sigma_{\rm SDSS}$, where $\sigma_{\rm SDSS}$ is the diagonal error bar on the SDSS data. We divide by the error to more clearly present the differences between theory and observations; the fractional error on the VPF (and the UPF) can range from lO"-' at small radii to 1 at large r. The data are highly correlated, so $\Delta P/\sigma_{\rm SDSS} \sim 1$ for several consecutive data points is still only a $\sim 1\sigma$ deviation overall. The errors on the lines are the jackknife error bars, quantifying the theoretical uncertainty in our predicted VPF for the HOD model that best fits $w_p(r_p)$, resulting from the finite volume of the simulation. The shaded region in the lower window of each panel represents the range in $P_0(r)$ from the MCMC models, quantifying the uncertainty in the theoretical prediction associated with the uncertainty in the HOD parameters. We now describe each sample in detail.
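The remark above that several consecutive $1\sigma$ residuals can amount to only a $\sim 1\sigma$ deviation overall can be illustrated with a toy covariance matrix. In this sketch the five bins, the common correlation coefficient of 0.9, and the helper name are assumptions chosen for illustration; they are not measured from the data.

```python
import numpy as np


def chi2(residual, cov):
    """chi^2 = r^T C^-1 r for a residual vector r and covariance matrix C."""
    r = np.asarray(residual, dtype=float)
    return float(r @ np.linalg.solve(cov, r))


n, rho = 5, 0.9          # five consecutive radial bins, assumed correlation
sigma = np.ones(n)       # work in units of the diagonal error
cov = rho * np.outer(sigma, sigma) + (1.0 - rho) * np.diag(sigma**2)
residual = np.ones(n)    # every point sits 1 sigma from the model

chi2_full = chi2(residual, cov)                      # uses the full covariance
chi2_diag = float(np.sum((residual / sigma) ** 2))   # ignores correlations
```

For this equicorrelated case the full calculation gives $\chi^2 = n/[1 + (n-1)\rho] \approx 1.09$, compared with 5 when the off-diagonal terms are ignored.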

Figure 3a presents the results for the $M_r - 5\log h < -19$ sample. Due to the low luminosity threshold, this sample has the smallest volume and largest observational errors on both $w_p(r_p)$ and the void statistics. It also has the highest number density, driving the VPF to zero at the smallest value of r of all four samples. The agreement between the measured VPF and that predicted by the best-fit HOD, which assumes no density dependence to $\langle N\rangle_M$, is excellent. The residuals are approximately $0.5\sigma_{\rm SDSS}$ or less at all r. The $\chi^2_{\rm VPF}$ for the best-fit model is 8.99 for 10 data points (note that "best-fit" here refers to $w_p(r_p)$, and no parameters are adjusted to match the VPF itself). Due to the errors on $w_p(r_p)$, the range in predicted $P_0(r)$ from the set of acceptable HOD fits is larger than the jackknife errors on $P_0(r)$ for an individual model, but $\Delta P/\sigma_{\rm SDSS} \lesssim 1$ for all HOD models with $w_p(r_p)$ fits of $\Delta\chi^2_{w_p} < 1$. The $\chi^2_{\rm VPF}$ values for these models range from 8.27 to 11.7. We attribute the larger range of $\chi^2_{\rm VPF}$ of 3-4 mostly to the increased volume of the DR4 sample relative to the $w_p(r_p)$ sample.

Figure 3b presents the results for $M_r - 5\log h < -20$ galaxies. The points with error bars in the upper panel of Figure 3b show the results from the restricted sample. The best-fit HOD prediction is $\sim 2\sigma$ low at r > 4 $h^{-1}$ Mpc, yielding $\chi^2_{\rm VPF}$ = ^^-^ for 10 data points. The range in $\chi^2_{\rm VPF}$ from the twenty MCMC models is 11.4 to 118. This is in sharp contrast to the results in Figure 3a, in which a set of $w_p(r_p)$ models with $\Delta\chi^2_{w_p} < 1$ produces a set of VPFs with $\Delta\chi^2_{\rm VPF} \lesssim 4$. The difference is due to the large range in $\sigma_{\log M}$ allowed by the $w_p(r_p)$ data. Although the VPF is most sensitive to the fraction of galaxies that are satellites, large variations in the central occupation function still influence the size of voids to some degree (Paper I, Figure 6). The value of $\chi^2_{\rm VPF}$ correlates with $\sigma_{\log M}$ such that sharper central cutoffs yield more accurate predictions for $P_0(r)$, with a correlation coefficient r = 0.94. Combined with the fact that the central cutoff shape is ill-constrained by $w_p(r_p)$ alone, the VPF adds significant information for constraining the HOD for this sample; models with $\sigma_{\log M} < 0.3$ yield $\chi^2_{\rm VPF} \lesssim 12$. The yellow shaded region presents the residuals for VPF predictions for the same analysis as the orange shaded region, but now using the full z < 0.10 volume. The larger volume and smaller $w_p(r_p)$ errors tighten the constraints on the HOD, but as noted in Z05 the presence of the Sloan Great Wall makes it difficult to find an HOD model that accurately fits the amplitude of the correlation function in the two-halo regime. The supercluster boosts the large-scale power in the two-point clustering, and dramatically alters the three-point clustering (Nichol et al. 2006; Baugh et al. 2004; Gaztañaga et al. 2005). This amplification of clustering creates larger voids in the galaxy distribution, producing residuals with respect to the HOD predictions that are significantly negative.

Fig. 3. Comparison of the measured SDSS VPF to HOD predictions from fitting $w_p(r_p)$. The luminosity sample is labeled in each panel. The results for each sample are presented in two panels; the upper panel presents the SDSS $P_0(r)$ and the HOD prediction, while the lower panel plots the difference between the data and prediction, relative to the error bar on the data. The errors on the HOD prediction are calculated from the simulation by the jackknife method. The shaded region in the lower panel represents the range in predictions from a sample of HODs with $\Delta\chi^2_{w_p} < 1$ with respect to the best-fit model. The data and model in the $M_r < -20$ panels use the restricted volume-limited sample. The yellow shaded region plots the results from using the full sample, z < 0.10.

Figure 3c presents the results for the $M_r - 5\log h < -21$ sample. Although this sample includes the Sloan Great Wall, the volume of this sample is large enough that the inclusion of this structure does not significantly alter the clustering statistics. Recall that for this sample (and for the $M_r - 5\log h < -22$ sample), we use the 1086 $h^{-1}$ Mpc box to calculate the HOD predictions and estimate the observational errors. The residuals of the best-fit model are $\Delta P/\sigma_{\rm SDSS} \lesssim 1$ for r < 10 $h^{-1}$ Mpc, but at larger scales the residuals gradually increase, reaching $\sim 2\sigma_{\rm SDSS}$ at r > 12 $h^{-1}$ Mpc. The $\chi^2_{\rm VPF}$ for the best-fit model prediction is 27.1 for 19 data points. The range of $\chi^2_{\rm VPF}$ values for the MCMC sample of models is 22.1 to 29.2, indicating that $P_0(r)$ adds some complementary information to $w_p(r_p)$ for constraining the occupation function, assuming that the HOD is environment independent. The value of $\chi^2_{\rm VPF}$ is negatively correlated with $\sigma_{\log M}$, but the correlation is much weaker than in Figure 3b, with a correlation coefficient r = -0.59. For this model, a joint fit to both $w_p(r_p)$ and $P_0(r)$ would most likely find a solution with a combined $\chi^2/\nu < 1$. We will discuss this further in the following section.

Fig. 4. Comparison of the measured SDSS UPF to HOD predictions from fitting $w_p(r_p)$. Jumps in the predictions and data occur when the number of [...] $M_r < -20$ panels have an additional yellow shaded region comparing the $P_0(r)$ from the z < 0.10 sample with the HOD constraints from $w_p(r_p)$ for the same sample.

Figure 3d presents the results for $M_r - 5\log h < -22$ galaxies. This sample has the largest volume, but galaxies above this magnitude threshold are rare. Thus Poisson fluctuations contribute substantially to the jackknife errors on $w_p(r_p)$ at smaller scales, and the Z05 $w_p(r_p)$ for this sample has no pairs at $r_p$ < 1 $h^{-1}$ Mpc. The lack of information on clustering in the one-halo regime weakens the constraints that can be placed on the HOD. The best-fit HOD model is in good agreement with the observations, with $\chi^2_{\rm VPF}$ = 29 for 29 data points. The range of $\chi^2_{\rm VPF}$ for the MCMC models is large, extending from 23.4 to 52.0. The shape of the central cutoff for these models varies from $\sigma_{\log M}$ = 0.5 to $\sigma_{\log M}$ = 0.8. The halos that contain galaxies in this magnitude regime lie in the exponential cutoff of the mass function, where the halo bias is a strong function of mass. Models with higher values of $\sigma_{\log M}$ have on average lower $\chi^2_{\rm VPF}$ values, yielding r = -0.74. Thus for samples of objects with limited clustering information at small scales, extra constraining power can be obtained through void statistics.

Figures 4a-4d present the UPF results for the same four luminosity samples. In each figure the upper panel shows the measured UPF for SDSS galaxies and the best-fit HOD prediction. As in Figure 3, the lower panels plot the residuals between data and best-fit model, as well as the range in predictions from the 20 MCMC models. The comparison of this statistic to the HOD predictions is similar to that of the VPF. In Figure 4a, the $\chi^2_{\rm UPF}$ of the best-fit model is 9.93 for 11 data points, with a range of 7.17 to 11.7 for the MCMC models. In Figure 4b, the best-fit HOD model is once again $\sim 2\sigma$ below the observations at r > 4 $h^{-1}$ Mpc, yielding $\chi^2_{\rm UPF}$ = 69.8 for 11 data points. The range in $\chi^2_{\rm UPF}$ values is 10.7 to 93.9, with $\chi^2_{\rm UPF}$ correlating with the value of $\sigma_{\log M}$ as with the models in Figure 3b. In Figure 4c, the HOD predictions are in better agreement with the data at large scales than the VPF results from Figure 3c, yielding $\chi^2_{\rm UPF}$ = 20.7 for 19 data points. The range of $\chi^2_{\rm UPF}$ values from the MCMC models is smaller than for the VPF results, with maximum and minimum values of 22.0 and 16.5, respectively.

For $M_r - 5\log h < -22$ galaxies, the UPF contains little new information with respect to the VPF. The number density of this sample is low enough that a single galaxy in a sphere is enough to make the local density larger than the threshold of $0.2\bar{n}_g$ for all r < 27 $h^{-1}$ Mpc. Therefore we compare predictions and measurements for the probability that a random sphere has exactly one galaxy within it. For r < 16 $h^{-1}$ Mpc this statistic probes overdense regions ($\delta_g > 0$). At large and small scales, the agreement between the data and best-fit model is excellent. At intermediate scales, 8 < r < 18 $h^{-1}$ Mpc, the agreement is adequate but the range of predictions is wide, resulting from the lack of tight constraints on the HOD from $w_p(r_p)$ alone.
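The two radii quoted in this paragraph are linked by simple geometry, which the short sketch below makes explicit. It uses only the numbers in the text (the 27 $h^{-1}$ Mpc radius and the 0.2 density threshold); the function name is illustrative, and the number density is the value implied by those two numbers, not a value read from Table 1.

```python
import math


def overdense_radius(r_threshold=27.0, density_fraction=0.2):
    """Radius below which one galaxy in a sphere exceeds the mean density.

    A single galaxy gives a density 1/V(r). At r_threshold this equals
    density_fraction * nbar, so nbar * V(r_threshold) = 1 / density_fraction.
    One galaxy matches the mean density when nbar * V(r) = 1, and V scales
    as r^3, so r = r_threshold * density_fraction**(1/3).
    """
    return r_threshold * density_fraction ** (1.0 / 3.0)


r_mean = overdense_radius()

# Number density implied by the 27 h^-1 Mpc threshold radius (h^3 Mpc^-3).
nbar_implied = 1.0 / (0.2 * (4.0 / 3.0) * math.pi * 27.0**3)
```

The result is a radius of about 15.8 $h^{-1}$ Mpc, consistent with the quoted r < 16 $h^{-1}$ Mpc, and an implied number density of roughly $6 \times 10^{-5}\,h^3\,{\rm Mpc}^{-3}$.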

With the exception of the $M_r - 5\log h < -20$ sample, void statistics do not provide a significant amount of new information about HOD parameters, but for each sample they do tighten the constraints on the shape of the central galaxy cutoff parameter, $\sigma_{\log M}$, relative to $w_p(r_p)$ alone.

3.2. Comparison to Density-Dependent Models 

In Paper I we presented a simple model for density-dependent occupation functions that focuses on changes to the minimum mass scale for central galaxies. The parameters of this model are $\delta_c$, the threshold density below which the HOD changes, and $f_{\rm min}$, the factor by which $M_{\rm min}$ changes in these low density regions. We calculate the local density of each halo with a top-hat smoothing filter with radius 5 $h^{-1}$ Mpc. For example, if $\delta_c = -0.5$ and $f_{\rm min} = 2$, halos in regions that are below 50% of the cosmic mean density must be twice as massive (relative to halos in denser regions) in order to host a galaxy above the luminosity threshold. A value of $f_{\rm min} = \infty$ corresponds to complete suppression of galaxies in regions below $\delta_c$. We can place these models in the context of assembly bias as defined by Croton et al. (2007) by calculating the ratio of the large-scale correlation function in the density-dependent model to its standard counterpart, i.e., $b_x = \sqrt{\xi_x/\xi_0}$. We find that the assembly bias exceeds 5% ($b_x > 1.05$) for $\delta_c > -0.5$ and $f_{\rm min} > 2$ for models of the $M_r - 5\log h < -19$ sample. For the brighter sample, $b_x > 1.05$ for models with $\delta_c > -0.1$. We demonstrated in §3.1 that the $M_r - 5\log h < -19$ and $-21$ void statistics are already well fit by the standard HOD. Thus adding two new parameters will not statistically improve the model. But we explore density-dependent models in order to constrain the level of assembly bias for central galaxies: to what degree can $f_{\rm min}$ differ from unity (the standard HOD assumption) and still adequately fit the void statistics?

When creating density-dependent HOD models, we follow the procedure outlined in Paper I: the number density of a sample is held constant, so if $M_{\rm min}$ in low-density areas increases ($f_{\rm min} > 1$), the overall $M_{\rm min}$ must decrease (slightly) to compensate for the missing low-density galaxies.
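The compensation step described above can be sketched for central galaxies as follows. This is a toy illustration under stated assumptions, not the procedure of Paper I: it assumes a sharp mass threshold for centrals, ignores satellites, and uses a random placeholder catalog; all function names are hypothetical.

```python
import numpy as np


def n_central(mass, delta, m_min, f_min, delta_c):
    """Count halos hosting a central galaxy, assuming a sharp mass threshold.

    Halos with local density contrast delta < delta_c must exceed
    f_min * m_min; all other halos must exceed m_min.
    """
    threshold = np.where(delta < delta_c, f_min * m_min, m_min)
    return int(np.count_nonzero(mass >= threshold))


def solve_m_min(mass, delta, n_target, f_min, delta_c,
                lo=1e9, hi=1e16, n_iter=60):
    """Bisect in log10(M_min) until the central count drops to n_target."""
    log_lo, log_hi = np.log10(lo), np.log10(hi)
    for _ in range(n_iter):
        mid = 0.5 * (log_lo + log_hi)
        if n_central(mass, delta, 10.0**mid, f_min, delta_c) > n_target:
            log_lo = mid  # too many galaxies: raise the threshold
        else:
            log_hi = mid  # at or below target: lower the threshold
    return 10.0**log_hi


# Placeholder halo catalog (random numbers, not simulation output).
rng = np.random.default_rng(1)
mass = 10.0 ** rng.uniform(11.0, 14.0, size=20000)
delta = rng.uniform(-1.0, 1.0, size=20000)

n_target = 5000
m_std = solve_m_min(mass, delta, n_target, f_min=1.0, delta_c=-0.6)
m_dd = solve_m_min(mass, delta, n_target, f_min=2.0, delta_c=-0.6)
```

With $f_{\rm min} > 1$ the solved threshold `m_dd` comes out below the standard value `m_std`, which is the slight decrease in the overall $M_{\rm min}$ noted in the text.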

Figure 5a presents three models with $f_{\rm min} = 4$ and $\delta_c = -0.6$, $-0.4$, and $-0.2$. Only $\sim 8\%$ of halos with mass M = 10" $M_\odot$ reside in regions with $\delta < -0.6$ (see Paper I, Figure 7), but the effect on the void statistics can be seen in the lower panel of Figure 5a, which shows the residuals of the predicted VPF to the data for this model. At r > 5 $h^{-1}$ Mpc, $\Delta P/\sigma_{\rm SDSS} \gtrsim 1$, and the discrepancy monotonically increases with increasing r. For $\delta_c = -0.4$ and $-0.2$, the effect can be seen clearly in the upper panel, with residuals that are larger than the scale of the lower panel. Figure 5c plots $\chi^2_{\rm VPF}$ as a function of $\delta_c$ for $f_{\rm min}$ = 2 and 4. The gray shaded region shows the range of $\chi^2_{\rm VPF}$ values from the 20 MCMC models. For $f_{\rm min} = 2$, there is no change to the VPF at $\delta_c = -0.9$, but as the threshold density increases, $\chi^2_{\rm VPF}$ rapidly increases, going from $\chi^2_{\rm VPF}$ = lO-'^ at $\delta_c = -0.8$ to $\chi^2_{\rm VPF} = 32.5$ at $\delta_c = -0.7$. As noted in Paper I, the effect of increasing $\delta_c$ 'saturates' at $\delta_c \approx -0.4$, yielding a maximum $\chi^2_{\rm VPF}$ of around 300. For $f_{\rm min} = 4$, $\chi^2$ rapidly increases at $\delta_c > -0.7$ and saturates at a value of 1200. The points along each $\chi^2$ curve indicate models that produce $b_x = 1.05$ and $b_x = 1.1$ (from right to left). Both points lie in the saturation regime, where the discrepancies with the data are largest. In other words, in this class of ($f_{\rm min}$, $\delta_c$) models, one cannot alter the large scale bias factor by 5% without drastically violating constraints from the VPF. A model with ($f_{\rm min}$, $\delta_c$) = (2, -0.75) yields a $\Delta\chi^2$ of 10 with respect to the standard HOD prediction.

At low $\delta_c$, the overall fraction of galaxies that are "moved" from low-density regions to median- and high-density regions is too small to affect the overall two-point clustering of the sample. As this fraction becomes non-negligible, the amplitude of the two-halo term increases as the mean bias of the sample is altered. Statistically, however, void statistics are far more sensitive to these changes in the galaxy distribution. For $f_{\rm min} = 2$, $\delta_c = -0.5$, the $\Delta\chi^2_{w_p}$ relative to the standard HOD is only 2, while $\Delta\chi^2_{\rm VPF} = 210$. Note also that the values of $\chi^2_{w_p}$ are dependent on the value of $\sigma_8$ assumed in the model. We have adopted a value of $\sigma_8 = 0.9$ to match that of the simulation. A lower value of $\sigma_8$, consistent with new results from cosmic microwave background anisotropies (Spergel et al. 2006), could compensate for the increased amplitude of the two-halo term in $w_p(r_p)$ for high-$\delta_c$ models. For the void statistics, no such degeneracy with $\sigma_8$ exists; in Paper I we showed that models with $\sigma_8$ = 0.9 and 0.7 yielded nearly identical void statistics, even though the lower value of $\sigma_8$ produced a poor fit to the observed $w_p(r_p)$. Thus $P_0(r)$ is both a more robust and more sensitive test of density dependence in the central galaxy occupation function.

Figures 5b and 5d compare the measured $P_U(r)$ to the same density-dependent models as in Figures 5a and 5c. As $\delta_c$ and $f_{\rm min}$ increase, the underdense regions increase in size and frequency. The advantage of the UPF is that the effect of density dependence does not saturate at high values of $\delta_c$; rather, the UPF will continue to increase as the threshold density increases. Additionally, the UPF is less susceptible to shot noise, and the percentage UPF error bars are close to half those of the VPF. However, this statistic is somewhat less sensitive to density dependence than the VPF because it probes moderately higher densities. While the $f_{\rm min} = 2$, $\delta_c = -0.5$ model yields $\chi^2_{\rm VPF}$ of 220, it yields $\chi^2_{\rm UPF}$ = 25.1. The UPF is still more sensitive to density dependence than $w_p(r_p)$ alone.

Fig. 5. (a) Upper panel: Points with error bars are the SDSS measurements of $P_0(r)$ for the $M_r - 5\log h < -19$ sample. Lines are three density-dependent HOD models with $f_{\rm min} = 4$ and $\delta_c = -0.6$ (red line), $-0.4$ (green line), and $-0.2$ (blue line). Bottom panel: The residuals of the model predictions relative to the data. (b) Same as (a), but for $P_U(r)$. (c) The $\chi^2_{\rm VPF}$ of the model predictions as a function of $\delta_c$ for models with $f_{\rm min} = 2$ (red line) and $f_{\rm min} = 4$ (blue line). The shaded horizontal band is the range in $\chi^2_{\rm VPF}$ from the twenty MCMC models, all using the standard HOD implementation. The points along each line indicate models that produce an assembly bias (in the correlation function) of 5% and 10%, from right to left. (d) Same as (c), but for the UPF.

For brighter galaxies, the standard HOD prediction for $P_0(r)$ for the $M_r - 5\log h < -21$ sample in Figures 3c and 4c yields voids that are somewhat large compared with the measured SDSS statistics. Density-dependent models with $f_{\rm min} > 1$ only increase the sizes of voids and make this discrepancy more significant. Therefore, we present results for models with $f_{\rm min} > 1$ and $f_{\rm min} < 1$; in the latter models, halos in underdense regions host (on average) more luminous galaxies at fixed halo mass. Figure 6a compares the VPFs for three different models to the SDSS data: ($f_{\rm min}$, $\delta_c$) = (2, -0.2), (0.5, -0.2), and (0.5, +0.2). The model with $f_{\rm min} = 2$ is clearly discrepant and yields residuals larger than the scale of the lower panel for r > 10 $h^{-1}$ Mpc. The two models with $f_{\rm min} = 0.5$ appear more consistent with the data than the standard HOD model in Figure 3c. The residuals are smaller at large scales, but these models tend to depress the frequency of small voids below what is measured in the SDSS. In Figure 6c, we present $\chi^2_{\rm VPF}$ as a function of $\delta_c$ for models with $f_{\rm min}$ = 0.5, 0.75, 2, and 4. The models with $f_{\rm min} > 1$ produce monotonically increasing $\chi^2_{\rm VPF}$ with increasing $\delta_c$, and are always worse fits to the data than the standard HOD. Models that produce $b_x > 1.05$ yield $\chi^2_{\rm VPF} > 100$. Negatively biased models with $f_{\rm min} < 1$ produce a minimum at ($f_{\rm min}$, $\delta_c$) = (0.75, -0.2), yielding $\chi^2_{\rm VPF}$ = 11.2. Figures 6b and 6d present the same results for the UPF. As with the VPF, models with $f_{\rm min} < 1$ are in better agreement with $P_U(r)$, producing a minimum of $\chi^2_{\rm UPF}$ = 11.8 at ($f_{\rm min}$, $\delta_c$) = (0.75, -0.2), as compared with the minimum $\chi^2_{\rm UPF}$ of 16.5 from the MCMC models. For this model, $b_x - 1 = -0.02$. Models with $b_x - 1$ of $-0.05$, indicated with the filled circles in Figures 6c and 6d, do not produce improved fits to the VPF or UPF.

Fig. 6. (a) Upper panel: Points with error bars represent the SDSS measurements of $P_0(r)$ for $M_r - 5\log h < -21$ galaxies. Lines represent three different density-dependent HOD models, $f_{\rm min} = 2$, $\delta_c = -0.2$ (red line), $f_{\rm min} = 0.5$, $\delta_c = -0.2$ (green line), and $f_{\rm min} = 0.5$, $\delta_c = +0.2$ (blue line). Lower panel: The residuals of the model predictions relative to the data. The filled circles represent a standard HOD model with $\sigma_{\log M} = 0.5$. (b) Same as (a), but for $P_U(r)$. (c) The $\chi^2_{\rm VPF}$ of the HOD predictions for $P_0(r)$ as a function of $\delta_c$. Red and blue lines represent models with $f_{\rm min} = 2$ and $f_{\rm min} = 4$, respectively. Green and cyan lines represent models with $f_{\rm min} = 0.75$ and $f_{\rm min} = 0.5$, respectively. The shaded horizontal line is the range in $\chi^2_{\rm VPF}$ from the twenty MCMC models. The filled circle represents the $\chi^2_{\rm VPF}$ of the model with $\sigma_{\log M} = 0.5$ with no density dependence. Colors are the same as for the lines. Points along each line indicate models that produce an assembly bias of 5% and 10%, from right to left. For the $f_{\rm min} < 1$ models, only the models with a 5% assembly bias are indicated. (d) Same as (c), but for the UPF.

As with the fainter samples, altering $\langle N_{\rm cen}\rangle_M$ in low density regions can alter the two-point clustering to some extent. The models with $f_{\rm min} > 1$ produce better fits to $w_p(r_p)$, resulting from the increased amplitude of large-scale clustering. In the comparison of the best-fit HOD model to the SDSS $w_p(r_p)$ data, it can be seen that the model is slightly below the measured amplitude in the two-halo regime. Thus redistributing galaxies from low to high density areas produces better agreement with the data. The models with $f_{\rm min} < 1$ have the opposite effect on the two-point clustering; these models lower the bias and increase the discrepancy with the $w_p(r_p)$ data. If we relax our constraints on the standard HOD models by setting $\sigma_{\log M} = 0.5$, the same result is achieved. This model yields $\chi^2_{w_p}$ = 10.4, a value similar to that of the best density-dependent model ($\chi^2_{w_p}$ = 11.4). The high $\chi^2_{w_p}$ is a result of the lower overall bias of the sample; the lower bias in turn produces smaller voids and yields $P_0(r)$ and $P_U(r)$ that are as accurate as the best density-dependent model. The residuals of the $\sigma_{\log M} = 0.5$ model are shown with the black dots in Figures 6a and 6b, and the $\chi^2$ values for the void statistics are shown in Figures 6c and 6d. Combining results for all data for this model, $\chi^2_{w_p} + \chi^2_{\rm VPF} + \chi^2_{\rm UPF}$ = 29.4 for 49 data points and 4 free parameters. This summation neglects the covariance between statistics, but it implies that a joint analysis of all data would easily find a set of HOD parameters that accurately fits both $w_p(r_p)$ and the void statistics. Thus no strong evidence for $f_{\rm min} < 1$ density dependence can be inferred.

4. COLOR-DEFINED SDSS SAMPLES 
4.1. Results for the Standard HOD



To model the occupation function of color samples, the standard HOD parameterization presented in §2 is used to describe the overall sample, but $\langle N_{\rm cen}\rangle_M$ and $\langle N_{\rm sat}\rangle_M$ are multiplied by coefficients that specify the blue fraction, $f_b^{\rm cen}$ and $f_b^{\rm sat}$, respectively. The fraction of blue central galaxies is parameterized with a lognormal function of the form

$$ f_b^{\rm cen}(M) = f_0^{\rm cen} \exp\left[ -\frac{(\log_{10} M - \log_{10} M_{\rm min})^2}{2(\sigma^{\rm cen})^2} \right] \qquad (5) $$

(cf. Z05, equation 11). Equation (5) is the same for satellite galaxies, with separate parameters $f_0^{\rm sat}$ and $\sigma^{\rm sat}$. This adds four free parameters to the HOD model, but in practice one of the new parameters is fixed by the overall blue fraction of galaxies (we choose /q™). We fit $w_p(r_p)$ for the full sample, red sample, and blue sample simultaneously. These samples will be correlated, but we only use the covariance matrices of each sample independently. Using the full sample adds some complementary information to the color-only $w_p(r_p)$ functions because it contains the cross-correlation of the red and blue galaxies.

Fig. 7. (a) Open squares with error bars show the measured $w_p(r_p)$ for $-20 < M_r - 5\log h < -19$ galaxies. Red and blue points represent red galaxies and blue galaxies. The lines represent the best-fit HOD model to the data, with the same color coding. (b) The best-fit occupation functions for the model in panel (a). Red and blue lines plot $\langle N\rangle_M$ for red and blue galaxies, respectively. The HOD for the full sample is the sum of these two curves. (c) Open squares with error bars show the VPFs for red and blue galaxies. Lines plot the HOD prediction for the VPF from the best-fit $\langle N\rangle_M$ in panel (b). (d) The reduced VPF for blue and red galaxies, plotted with blue and red squares, respectively. The solid line represents the negative binomial model, which provides a good fit to the blue galaxy VPF, but the HOD prediction (dotted curve) is much more accurate for the red galaxies.

To measure the color-dependent VPF from DR4, we adopt the same color cut as Z05, $g - r = -0.03(M_r - 5\log h) + 0.21$. The fraction of blue galaxies varies significantly with luminosity. Therefore we use galaxies with magnitudes $-20 < M_r - 5\log h < -19$, rather than a sample defined by a luminosity threshold, to ensure that the red and blue samples have similar mean magnitudes. We choose this sample because we wish to analyze the lowest luminosity sample for which accurate measurements can be made. Croton et al. (2007) find that the assembly bias of red galaxies monotonically increases with decreasing luminosity. The use of a magnitude bin sample necessitates an upper cutoff mass for the central occupation function, representing the mass at which halos begin hosting central galaxies too bright to be contained within the sample. For simplicity, we adopt a step function cutoff at an upper mass limit of 10'^ $h^{-1}\,M_\odot$, obtained from fitting the $M_r - 5\log h < -20$ sample with a step-function $\langle N_{\rm cen}\rangle_M$ (i.e., $\sigma_{\log M} = 0$). Like Z05, we also set $\sigma_{\log M} = 0$ for the magnitude bin sample, effectively making the central occupation function a square window. Although we are fitting the same data presented in Z05 (see their Figure 23 and Table 3), we use a different linear power spectrum, and $w_p(r_p)$ must be re-fit. We use $\chi^2$ minimization to once again determine the best-fit model, the parameters of which are listed in Table 2 (see footnote). The large values of $\sigma^{\rm cen}$ and $\sigma^{\rm sat}$ in Table 2 essentially mean that the blue galaxy fraction is a constant as a function of mass.
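The statement that large width parameters make the blue fraction effectively constant with mass can be checked directly from the lognormal form of Equation (5). The sketch below assumes the exponent is $-(\log_{10} M - \log_{10} M_{\rm min})^2 / 2\sigma^2$; the function name and all parameter values are hypothetical and are not the Table 2 entries.

```python
import numpy as np


def blue_fraction(mass, f0, sigma, m_min):
    """Lognormal blue fraction: f0 * exp(-(log10 M - log10 Mmin)^2 / (2 sigma^2))."""
    x = np.log10(mass) - np.log10(m_min)
    return f0 * np.exp(-x**2 / (2.0 * sigma**2))


m_min = 1e11                          # illustrative minimum mass, h^-1 Msun
masses = np.logspace(11.0, 14.0, 7)   # three decades in halo mass

narrow = blue_fraction(masses, f0=0.7, sigma=1.0, m_min=m_min)
wide = blue_fraction(masses, f0=0.7, sigma=50.0, m_min=m_min)
```

For the narrow case the fraction falls steeply over three decades in mass, while for the wide case it stays within a fraction of a percent of `f0` over the same range.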

Figure 7a shows the results of the HOD modeling of the color-dependent clustering. The open squares plot the data from Z05 while the solid lines plot the best-fit HOD models. Blue and red colors represent blue galaxies, red galaxies, and the full sample, respectively. The $\chi^2_{w_p}$ for the full set of 33 data points is 11.7 (recall, however, that we have not taken into account the covariance between samples). The amplitude of clustering increases at all scales from blue to red galaxies. The known correlation between galaxy color and environment states that red galaxies exist in denser environments, implying that they occupy higher-mass halos that are strongly biased. Blue galaxies generally live in the field, implying that they are the central galaxies of lower-mass halos that are less strongly clustered. The best-fit occupation functions, shown in Figure 7b, bear this out. Blue galaxies dominate the central occupation function, while satellite galaxies are primarily red galaxies. These results are consistent with those in Z05.

Footnote: Note that the satellite occupation functions in Table 3 of Z05 assume a luminosity threshold sample. To obtain $\langle N_{\rm sat}\rangle_M$ for each magnitude bin, $\langle N_{\rm sat}\rangle_M$ for the next-brighter bin was subtracted off. The parameters of $\langle N_{\rm sat}\rangle_M$ in this paper are for the magnitude bin only and do not require knowledge of $\langle N_{\rm sat}\rangle_M$ of brighter galaxies.

Points in Figure 7c show the measured VPFs for red and
blue galaxies. The VPF for red galaxies is significantly higher
than for blue galaxies, by nearly 0.5 dex at r = 11 Mpc. While
the number density of the red sample is below that of the blue
sample, diluting the blue sample randomly to match the red
number density only increases the VPF at r = 11 Mpc by
0.04 dex; the larger voids in red galaxies are a consequence
of their stronger clustering. Curves show the VPF predictions
of the HOD model from Figure 7b, in which a ~ 10" $h^{-1}\,M_\odot$
halo has a ~30% chance of hosting a red galaxy independent
of its large-scale environment. The agreement with
the measured VPFs is strikingly good, with $\chi^2_{\rm VPF} = 9.89$ for
red galaxies and Xvpf - ^-^^ for blue galaxies (with 10 data 
points in each case). We do not perform the MCMC analysis
for this sample, but we expect the results to be similar to those for
the luminosity-defined $M_r - 5\log h < -19$ sample in Figure 3.
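The dilution test described above can be illustrated with a minimal sketch. It assumes a mock catalog in a periodic cubic box and estimates the VPF by dropping random spheres. The function names `void_probability` and `dilute`, the periodic wrapping, and the default sphere count are assumptions of this sketch, not the procedure used for the survey measurements.

```python
import random

def void_probability(points, box_size, radius, n_spheres=2000, seed=0):
    """Fraction of randomly placed spheres that contain no points.

    Assumes a periodic cubic box of side box_size (an assumption of this sketch).
    """
    rng = random.Random(seed)
    r2 = radius * radius
    empty = 0
    for _ in range(n_spheres):
        center = [rng.uniform(0.0, box_size) for _ in range(3)]
        for point in points:
            d2 = 0.0
            for c, p in zip(center, point):
                d = abs(p - c)
                d = min(d, box_size - d)  # periodic wrap
                d2 += d * d
            if d2 < r2:
                break  # sphere is occupied, stop checking
        else:
            empty += 1  # no point fell inside this sphere
    return empty / n_spheres

def dilute(points, target_count, seed=0):
    """Randomly subsample a catalog down to target_count objects."""
    rng = random.Random(seed)
    return rng.sample(points, target_count)
```

Comparing `void_probability` for a sample before and after `dilute` separates the part of a VPF difference that comes from number density alone from the part that comes from clustering.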

Figure 7d presents the data in the form of the reduced VPF
(RVPF), in which the quantity $\chi = -\ln(P_0)/\bar{N}$ is plotted as a
function of $\bar{N}\bar{\xi}$, where $\bar{N}$ is the mean number of galaxies in a
sphere of radius r and $\bar{\xi}$ is the volume-averaged two-point
correlation function (in redshift space). The quantity $\bar{\xi}$ is related
to the variance of the distribution of cell counts, yielding

$$\bar{\xi} = \frac{\langle (N - \bar{N})^2 \rangle - \bar{N}}{\bar{N}^2}. \qquad (6)$$

Under the hierarchical clustering ansatz (see, e.g.,
Bernardeau et al. 2002), all higher-order n-point correlation
functions can be written in terms of powers of the
two-point correlation function and a scaling coefficient.
Many different theoretical models have been proposed for
the scaling coefficients (see Fry et al. 1989). Croton et al.
(2004) and Conroy et al. (2005) both demonstrated that
luminosity-defined samples of galaxies exhibit the void
statistics predicted by a negative binomial model, in which
the VPF is related to $\bar{\xi}$ and $\bar{N}$ by

$$P_0(r) = (1 + \bar{N}\bar{\xi})^{-1/\bar{\xi}}. \qquad (7)$$

This result led Conroy et al. (2005) to conclude that the VPF
contains no complementary information over the two-point
correlation function for constraining galaxy bias or halo
occupation. The RVPF for blue galaxies in Figure 7d is consistent
with the negative binomial model, but for red galaxies the
negative binomial is not a good description of the data, in
agreement with the recent results from the 2dFGRS of Croton et al.
(2006a). The HOD model, shown with the red dotted line,
correctly predicts the behavior of the RVPF for this sample. In
tests we find that occupation functions that produce correlation
functions with large residuals from a power law tend to lie
away from the negative binomial model in RVPF space. The
high fraction of satellite galaxies in the red occupation function
produces the strong transition from the one-halo to two-halo
regime exhibited by red galaxies in Figure 7a, leading to
the behavior seen in Figure 7d. The correlation function for the blue
sample is very close to a power law and thus is well described
by the negative binomial. This trend works in the opposite
direction as well; HODs that de-emphasize high-mass halos,
such as those with a lower value of $\alpha_{\rm sat}$, lie below the negative
binomial curve, indicating that the negative binomial is
not universal, but depends on the details of the halos occupied by
a given class of galaxies.

It should be noted that Conroy et al. (2005) use $\bar{\xi}$ in redshift space; in
essence they utilize more information than contained in Wp(rp) alone. When
analyzing the clustering in mock galaxy samples, those authors found that the
negative binomial is not a good description of real-space clustering.
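As a worked illustration of the quantities used in this subsection, the sketch below evaluates the negative binomial VPF, the reduced VPF, and the volume-averaged correlation function estimated from the variance of cell counts. The function names are hypothetical, and the use of the population variance in `xibar_from_counts` is an assumption of this sketch.

```python
import math

def negative_binomial_vpf(nbar, xibar):
    """P0 = (1 + nbar*xibar)**(-1/xibar); reduces to Poisson exp(-nbar) as xibar -> 0."""
    if xibar == 0.0:
        return math.exp(-nbar)
    return (1.0 + nbar * xibar) ** (-1.0 / xibar)

def reduced_vpf(p0, nbar):
    """Reduced VPF, chi = -ln(P0) / nbar."""
    return -math.log(p0) / nbar

def negative_binomial_rvpf(nbar_xibar):
    """Negative binomial prediction chi = ln(1 + N*xi) / (N*xi)."""
    if nbar_xibar == 0.0:
        return 1.0
    return math.log1p(nbar_xibar) / nbar_xibar

def xibar_from_counts(counts):
    """Estimate xibar = (variance - mean) / mean**2 from counts in cells."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / n  # population variance
    return (variance - mean) / mean ** 2
```

Because the negative binomial curve depends only on the product of the mean count and the volume-averaged correlation function, measurements at different radii and number densities can be placed on a single RVPF diagram.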

4.2. Comparison to Density-Dependent Models 

The assembly bias seen in the Croton et al. (2007) semi-analytical
models is strongest for faint red galaxies, implying that low-mass
halos that host red galaxies at their centers almost exclusively
reside near a much larger halo, while in low-density environments
red central galaxies are rare. To model this form of assembly
bias in our HOD models, we adopt a parameterization similar
to that for the luminosity-defined samples: at a density below
a threshold $\delta_c$, the fraction of central galaxies that are red
changes by a factor $f_{\rm red}$.
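The parameterization can be sketched as a per-halo probability of hosting a red central galaxy. In this sketch the local density contrast of each halo is taken as given, and the red fraction above the threshold is boosted so that the total number of red centrals is unchanged. That renormalization scheme and the function names are assumptions of the sketch, not a description of the actual implementation.

```python
def red_central_probability(delta, base_fraction, delta_c, f_red):
    """Red central fraction for one halo: scaled by f_red below the threshold."""
    return base_fraction * f_red if delta < delta_c else base_fraction

def renormalized_probabilities(deltas, base_fraction, delta_c, f_red):
    """Per-halo red fractions with the mean red fraction held fixed.

    Red centrals removed below delta_c are redistributed uniformly over
    halos at or above delta_c (an assumption of this sketch).
    """
    n_low = sum(1 for d in deltas if d < delta_c)
    n_high = len(deltas) - n_low
    removed = base_fraction * (1.0 - f_red) * n_low
    if n_high > 0 and base_fraction > 0.0:
        boost = 1.0 + removed / (base_fraction * n_high)
    else:
        boost = 1.0
    high_fraction = min(1.0, base_fraction * boost)  # a probability cannot exceed 1
    return [base_fraction * f_red if d < delta_c else high_fraction for d in deltas]
```

With `f_red = 0` no halo below the threshold hosts a red central, which is the most extreme case of this class of models.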

Figure 8 shows the results for models in which $f_{\rm red} = 0$,
implying that there are no red central galaxies in halos with local
density below $\delta_c$. Figure 8a compares the Wp(rp) data for the red
sample to density-dependent models with $\delta_c = -0.4$, $-0.2$, and $0.0$.
As with the luminosity-defined samples, the amplitude of the two-halo
term in the HOD models increases as red galaxies are removed from
low-density areas and redistributed in mean- and high-density
environments. The model with $\delta_c = -0.4$, while in reasonable
agreement with the Wp(rp) data, is clearly discrepant with the
VPF, yielding $\chi^2_{\rm VPF} = 93.2$ for 10 data points. Less extreme
models, $f_{\rm red} = 1/8$ and $1/4$, still produce $\chi^2_{\rm VPF}$ values
of 60.7 and 29.4, respectively, at $\delta_c = -0.4$. As $\delta_c$ increases,
the discrepancy with both the VPF and Wp(rp) data gets
substantially larger. Baldry et al. (2006) have investigated the
environmental dependence of the halo occupation for central
red galaxies in the Croton et al. (2007) model. Although their
definition of environment is based on nearest-neighbor statistics,
they find that the red fraction in the model decreases by
nearly a factor of ten around the mean density. In Croton et al.
(2007), the assembly bias for red $M_r - 5\log h = -19$ galaxies
increases the large-scale bias of red central galaxies by a factor
of 2. For the overall population of red galaxies at this
magnitude, the assembly bias is ~1.25, comparable to the
increase in Wp(rp) in the $f_{\rm red} = 1/8$, $\delta_c = -0.2$ model, which
yields Xvpf - ^^^8. A more direct comparison is required to 
make precise statements about the form of the assembly bias
in Croton et al. (2007), but the results in Figure 8 only allow for low
levels of assembly bias for faint color-defined samples.

5. 2DFGRS RESULTS 

5.1. 2dFGRS Wp(rp) Data and Modeling

The approach we take to apply the HOD to clustering mea- 
surements from the 2dFGRS differs slightly from that used 
above. 2dFGRS measurements are made on luminosity bins 




Fig. 8. — Panel (a): Data and models for Wp(rp) for red galaxies in the $-20 < M_r - 5\log h < -19$ sample. Points with error bars are SDSS data. Lines are
models in which the central occupation function is set to zero for halos with local densities below -0.4 (black solid line), -0.2 (red solid line), and 0.0 (dashed
line). Panel (b): VPFs predicted by those same models. Points with error bars are the SDSS data from Figure 2. In both panels, the red line is a model with
similar assembly bias as that found in Croton et al. (2007) for the same luminosity range.



rather than threshold samples. This necessitates a modified
form of the central occupation function and a somewhat different
approach to model fitting. We use the approach detailed in
Tinker et al. (2007) for modeling these data. We use
measurements of Wp(rp) that have been updated from those
presented in Norberg et al. (2001) and Norberg et al. (2002a)
to include the full data release of the 2dFGRS (Colless et al.
2003), an increase from ~160,000 galaxies to ~221,000
galaxies. The details of the clustering measurements will be
found in Norberg et al. (in preparation). We present here a
brief summary of the calculations. Using the full 2dFGRS
survey we create four volume-limited samples, with faint limits
from $M_{b_J} - 5\log h = -18.0$ to $M_{b_J} - 5\log h = -21.0$, each
sample 1.0 magnitudes wide. All galaxies brighter than $M_{b_J} = -21$
are grouped into a single sample. As in Norberg et al.
(2001, 2002a), a careful account of the selection function is
made and the correlation functions are obtained using the
standard Landy & Szalay (1993) and Hamilton (1993) estimators,
with typically 100 times more randoms than galaxies.
The projected correlation function is estimated by integrating
$\xi(r_p, \pi)$ out to $\pi_{\max} = 70$ Mpc, providing a stable
estimate for Wp(rp) out to at least $r_p = 40\,h^{-1}$ Mpc. Due
to the sensitivity of the results to close-pair incompleteness,
we only use data from scales $r_p > 150$ kpc. The
correlation function is measured in twelve radial bins, spaced
evenly by 0.2 in $\log_{10} r$ beginning at $\log_{10} r = -0.7$. The errors
on the clustering measurements are estimated by a bootstrap
resampling technique on roughly equal-sized subregions, of
which there are 16 in total (8 in each 2dFGRS region, covering
in each region approximately the same survey area; see
Porciani & Norberg 2006 for further details). We estimate the
full covariance matrix for each sample using 100 bootstrap
resamplings. The analysis in Tinker et al. (2007) was performed
on bins of width 0.5 magnitudes. The bins used here are a full
magnitude wide, so we redo the analysis on these new data.
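The two numerical steps in this summary, the line-of-sight integration and the bootstrap covariance, can be sketched as follows. The sketch assumes the standard definition of the projected correlation function as twice the integral of the two-dimensional correlation function from zero to the maximum line-of-sight separation. It also simplifies the bootstrap by averaging per-subregion estimates rather than recombining pair counts. Function names and defaults are hypothetical.

```python
import random

def projected_correlation(xi, rp, pi_max=70.0, n_steps=700):
    """w_p(r_p) = 2 * integral_0^pi_max xi(r_p, pi) dpi, by the midpoint rule."""
    dpi = pi_max / n_steps
    total = 0.0
    for k in range(n_steps):
        total += xi(rp, (k + 0.5) * dpi) * dpi
    return 2.0 * total

def bootstrap_covariance(subregion_wp, n_resamples=100, seed=0):
    """Covariance of the mean w_p over bootstrap resamplings of subregions.

    subregion_wp[s][b] is the estimate in radial bin b from subregion s.
    Averaging per-subregion estimates is a simplification of this sketch.
    """
    rng = random.Random(seed)
    n_reg = len(subregion_wp)
    n_bins = len(subregion_wp[0])
    means = []
    for _ in range(n_resamples):
        picks = [rng.randrange(n_reg) for _ in range(n_reg)]  # draw with replacement
        means.append([sum(subregion_wp[p][b] for p in picks) / n_reg
                      for b in range(n_bins)])
    avg = [sum(m[b] for m in means) / n_resamples for b in range(n_bins)]
    return [[sum((m[a] - avg[a]) * (m[b] - avg[b]) for m in means) / (n_resamples - 1)
             for b in range(n_bins)]
            for a in range(n_bins)]
```

For the setup described above, `subregion_wp` would hold 16 rows of twelve radial bins each, and `n_resamples=100` matches the number of resamplings quoted.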

In contrast to HOD models of luminosity threshold samples,
binned samples require both a minimum and a maximum
mass scale for central galaxies; as halo mass increases, central
galaxies become brighter. The central occupation function for
these samples is denoted $\langle N_{\rm cen}\rangle_M^i$, where $i$ denotes the magnitude
bin. The sum over all $\langle N_{\rm cen}\rangle_M^i$ must be less than or equal to
unity. Thus we use a modified form of equation (3) that
subtracts off brighter galaxies, i.e.,



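$$\langle N_{\rm cen}\rangle_M^i =
\begin{cases}
\dfrac{1}{2}\left[1 + {\rm erf}\left(\dfrac{\log M - \log M_{\rm min}^i}{\sigma_{\log M}^i}\right)\right] - \langle N_{\rm cen}\rangle_M^{i+1}, & 1 \le i \le 3, \\[2ex]
\dfrac{1}{2}\left[1 + {\rm erf}\left(\dfrac{\log M - \log M_{\rm min}^i}{\sigma_{\log M}^i}\right)\right], & i = 4,
\end{cases} \qquad (8)$$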



where $M_{\rm min}^i$ is the cutoff mass scale for central galaxies, and
$\sigma_{\log M}^i$ controls the width of the cutoff mass range. In equation
(3), $M_{\rm min}$ is defined as the mass at which $\langle N_{\rm cen}\rangle_M = 0.5$, but in
equation (8) this mass can differ from $M_{\rm min}^i$. The form we use
for the satellite galaxy occupation function is



$$\langle N_{\rm sat}\rangle_M^i = \exp\left(-\frac{M_{\rm cut}^i}{M - M_{\rm min}^i}\right)\frac{M}{M_1^i}. \qquad (9)$$



Because information about all the bins is required to calculate
$\langle N\rangle_M^i$ for any bin $i < 4$, the best-fit occupation functions
are determined simultaneously for all four bins. The model
has 13 free parameters, with each $M_{\rm min}^i$ once again constrained
by the number density of galaxies within each bin, calculated
from the 2dFGRS luminosity function (Norberg et al.
2002b, updated to include the results from the full data
release). For 48 data points, the best-fit $\chi^2 = 30.3$, yielding
a $\chi^2$ per degree of freedom of 0.87. The parameters of the
best-fit model are listed in Table 3. We make our predictions
for the VPF by populating the 400 $h^{-1}$ Mpc box in the same
manner as for the SDSS samples.

5.2. VPF Measurements and HOD Predictions 

Measurements of the VPF for the 2dFGRS have been presented
by Hoyle & Vogeley (2004), Croton et al. (2004), and
Patiri et al. (2006). For the purposes of this study, none of
these measurements is entirely adequate. Hoyle & Vogeley
(2004) use the k + e correction of Norberg et al. (2001) in
their analysis, which is significantly different from the latest
k + e correction for 2dFGRS galaxies presented in Cole et al.
(2005) and used in the Wp(rp) measurements described above.
TABLE 2

Best-fit HOD model parameters for SDSS galaxies 



V QtTinl 
OaliiUiC 


X 1^ 






Qsat 


Mcut 




-19 


4.89/7 


3.76 X 10" 


9.23 X 10'2 


1.11 


4.23 X 10' 


0.158 


-20 


4.77/7 


2.69 X 10'2 


2.46 X 10'3 


1.13 


2.12 X 10'° 


0.915 


-20' 


8.63/7 


9.37 X 10" 


1.39 X 10'3 


1.01 


9.54 X 10'2 


0.084 


-21 


7.48/7 


4.89 X 10'2 


1.05 X 10"* 


1.23 


3.58X 10'2 


0.052 


-22 


0.87/3 


1.17 X lO''' 


4.21 X 10"* 


1.20 


2.40 X 10"* 


0.615 


[-19,-20] 


11.7/28 


3.91 X 10" 


1.32 X 10'3 


1.06 










of- 


rcen 




r.sat 
Jb 




[-19,-20] 




1.91 


0.68 


9.46 


0.33 




Note. — All masses are in units of $M_\odot$. The bottom two rows are parameters for modeling color-selected samples. See §4 for a discussion.











TABLE 3 

Best-fit HOD model parameters for 2dFGRS galaxies 



Sample x M' . M\.. 



loglW 



[-18.0,-19.0] 5.7 2.79 x 10" 9.19 xlO'^ 5.07x10" 0.25 
[-19.0,-20.0] 10.0 6.14x10" 1.50x10*^ 1.55 xlO'^ 0.07 
[-20.0,-21.0] 8.8 3.15x10*2 4.23x10*3 1.35x10" 0.23 
<-21.0 5^8 5.21 x 10*3 3.55 x 10"* 1.27 X lO*'* 0.53 

Note. — All masses are in units of $h^{-1}\,M_\odot$. All samples are analyzed simultaneously, so $\chi^2/\nu = 30.3/(48-13) = 0.87$.



This leads to a difference in the number densities of galaxies
between the Wp(rp) samples and the $P_0(r)$ samples. This
difference becomes larger with the mean redshift of the sample,
and the $M_{b_J} - 5\log h < -21$ sample in Hoyle & Vogeley (2004)
has more than twice as many galaxies in it as the sample
analyzed here. Note that the effect of this mismatch is quite
different from the difference in number densities in the SDSS
samples and the HOD models in §3. That discrepancy is due
to the survey selection function, but Hoyle & Vogeley (2004)
essentially analyze different sets of galaxies, which have
different clustering and void statistics. Thus one would not
expect our HOD predictions to match their $P_0(r)$ measurements,
even if we adjusted our models to match their $\bar{n}_g$. For the
Croton et al. (2004) data, the completeness correction applied
to them introduces an unquantified systematic error that is
difficult to model. Their measurements also employ an outdated
k + e correction from Norberg et al. (2002b), although the
differences between this correction and the Cole et al. (2005)
correction are substantially smaller. Patiri et al. (2006)
construct volume-limited samples within the 2dFGRS with
magnitude thresholds of $M_{b_J} - 5\log h < -19.32$ and
$M_{b_J} - 5\log h < -20.181$, values that do not correspond to the
unit magnitude bins of our Wp(rp) measurements. Thus for
comparison with our HOD predictions we repeat the analysis of
Patiri et al. (2006), making several adjustments to better
facilitate the comparison. We use the Cole et al. (2005) k + e
correction, and all galaxies are corrected to z = 0.1. We construct
volume-limited samples that match our Wp(rp) samples, and we keep
track of the number density of galaxies at each r in order to
repeat the procedure used above for comparing to SDSS data.

We create HOD predictions by populating the 400 $h^{-1}$ Mpc
simulation with the best-fit HOD parameters for each magnitude
bin and scaling the number density at each r to the value
measured, as with the SDSS data. Error bars on the data are
also obtained from this simulation. The mean incompleteness
of the 2dFGRS is larger than in the SDSS, and the variation
of $\bar{n}_g(r)$ is also larger. We are unable to use the 1086 $h^{-1}$ Mpc
box for modeling the brighter two samples because the
occupation functions extend below the resolution limit of that
simulation. For these reasons, we do not perform a detailed
statistical analysis as with the SDSS samples, but rather
compare the data and models more qualitatively.

Figures 9a and 9b show the results for the $-19 < M_{b_J} - 5\log h < -18$
and $-20 < M_{b_J} - 5\log h < -19$ magnitude bins.
The best-fit HOD model accurately predicts the VPF for
these two samples. Figure 9c shows the results for
$-21 < M_{b_J} - 5\log h < -20$. The model slightly over-predicts $P_0(r)$ for
$r > 10\,h^{-1}$ Mpc in much the same way as for the $M_r - 5\log h < -21$
SDSS sample, but with smaller significance (with respect to
diagonal error bars only). For the brightest 2dFGRS galaxies
in Figure 9d, our model is a poor fit to the observed VPF.
The voids in the data are clearly much smaller than those
predicted by the HOD. As a rough guide, the diagonal-only
Xdiag ~ f^'" best-fit Wp{rp) model (not shown in this 
Figure). These rare galaxies reside in rare, highly biased halos
and the model predictions are more sensitive to the value
of $\sigma_{\log M}$ than for other samples. To explore this effect, we
analyze this sample separately in an MCMC chain. The upper
and lower bounds of the shaded region are models from
the chain with the lowest and highest values, respectively, of
$\sigma_{\log M}$ with $\Delta\chi^2_{w_p} < 1$ with respect to the best-fit model. The
lower bound, with $\sigma_{\log M} = 0.9$, lies closer to the data but is
still significantly discrepant. Although it is possible to construct
density-dependent models to match the measured $P_0(r)$,
these models will be highly discrepant with the measurements
of Wp(rp) since they require an increase in the galaxy formation
efficiency in lower density regions. Berlind et al. (2006)
investigated the clustering of massive galaxy groups, demonstrating
that at fixed mass, systems with bluer central galaxies
are more strongly clustered than those with redder central galaxies.
Because 2dFGRS is a blue-selected survey, the effect of density
dependence would make the voids in the brightest 2dFGRS
galaxies larger than in the standard HOD, the opposite of the
discrepancy in Figure 9d.

The conflict in Figure 9d can be resolved if the brightest 2dFGRS
galaxies are not always in the most massive halos. The solid
line in Figure 9d represents a model for this sample in which
the maximum value of $\langle N_{\rm cen}\rangle_M$ is 0.5 rather
than unity. Equation (8) is modified by a simple multiplicative
factor of 0.5, preserving the shape of the cutoff. Physically,
this model implies that the relationship between host
halo mass and central galaxy luminosity $L_c$ becomes essentially
flat at $M > 10^{14}\,h^{-1}\,M_\odot$ in $b_J$. Setting the maximum
value of $\langle N_{\rm cen}\rangle_M$ to 0.5 reduces $M_{\rm min}$ for this sample in order
to match the number density, and the overall bias of the model
is reduced. This model is conceptually similar to one with
a very large value of $\sigma_{\log M}$, large enough that $\langle N_{\rm cen}\rangle_M$
never reaches unity at the largest resolved halos. But due
to the functional form of equation (8), values of $\sigma_{\log M}$ large
enough to resolve the discrepancy with the $P_0(r)$ data place
a non-negligible fraction of the brightest galaxies in ~$10^{11}$
$M_\odot$ halos, which is both physically unreasonable and
significantly lowers the amplitude of Wp(rp). The large-scale
amplitude of the low-$\langle N_{\rm cen}\rangle_M$ model is also below that of
the fiducial model, but the increase in $\chi^2$ for the Wp(rp) fit is a modest ~2.
The effect on the void statistics is marked: the low-$\langle N_{\rm cen}\rangle_M$
model VPF is in good agreement with the observations. The
lower bound of the shaded region in Figure 9d (the model
with $\sigma_{\log M} = 0.9$) yields $\chi^2_{\rm diag} = 91.3$, while the low-$\langle N_{\rm cen}\rangle_M$
model yields $\chi^2_{\rm diag} = 35.3$. To properly compare these models,
larger simulations with the proper mass resolution are required
to estimate the covariance matrices, but it is clear that
the low-$\langle N_{\rm cen}\rangle_M$ model is an improvement. The clustering of
the galaxies in the next-brightest bin is unaffected by this
change in the HOD; although fainter galaxies can occupy the
highest mass halos, the overall number of these galaxies is
insignificant, and both Wp(rp) and $P_0(r)$ are unchanged.
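Why capping the central occupation lowers the minimum mass at fixed number density can be seen in a toy calculation. The sketch below uses an error-function central occupation scaled by an amplitude and solves for the minimum mass by bisection. The halo mass function is a hypothetical power law with an exponential cutoff in arbitrary units, chosen only for illustration, and all function names are assumptions of this sketch.

```python
import math

def central_occupation(log_m, log_m_min, sigma_log_m, amplitude=1.0):
    """Error-function central occupation scaled by a maximum amplitude."""
    return amplitude * 0.5 * (1.0 + math.erf((log_m - log_m_min) / sigma_log_m))

def toy_mass_function(log_m):
    """Hypothetical dn/dlogM in arbitrary units (not a fitted mass function)."""
    m = 10.0 ** (log_m - 14.0)
    return m ** -0.9 * math.exp(-m)

def number_density(log_m_min, sigma_log_m, amplitude, lo=10.0, hi=16.0, n=1200):
    """Number density of centrals, integrated over log M by the midpoint rule."""
    d = (hi - lo) / n
    total = 0.0
    for k in range(n):
        log_m = lo + (k + 0.5) * d
        total += toy_mass_function(log_m) * central_occupation(
            log_m, log_m_min, sigma_log_m, amplitude) * d
    return total

def solve_log_m_min(target, sigma_log_m, amplitude, lo=10.5, hi=15.5):
    """Bisection: number density falls monotonically as log_m_min rises."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if number_density(mid, sigma_log_m, amplitude) > target:
            lo = mid  # too many galaxies, raise the minimum mass
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Solving with `amplitude=0.5` for the number density produced by `amplitude=1.0` returns a lower minimum mass, so the sample is spread over more numerous, less biased halos.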

For galaxy groups in the 2dFGRS, Yang et al. (2005) find
that $L_c$ increases with halo mass as $M^{2/3}$ at low masses, but
becomes shallower for $M > 10^{13}\,h^{-1}\,M_\odot$, increasing as $M^{1/4}$.
At $M > 10^{14}\,h^{-1}\,M_\odot$, the mass at which $\langle N_{\rm cen}\rangle_M$ reaches its
maximum, the scatter in the $L_c$-$M$ relation becomes large,
covering nearly half a dex in $L_c$. For redder bands like Sloan
r, a continual monotonic relation between $L_c$ and M is well
motivated, but simply because of the width of the color-magnitude
relation some galaxies from a lower $M_r$ magnitude bin will fall
in the brightest $b_J$ bin. As Cole et al. (2006) recently pointed
out, the color distributions of the SDSS and 2dFGRS are
substantially different, with the SDSS being dominated by red
galaxies and the 2dFGRS dominated by blue objects. Convolved
with the magnitude errors of the 2dFGRS, which are
larger than those in the SDSS (and will scatter asymmetrically
from lower luminosities to higher luminosities), a complete
sample of the brightest $M_r$ galaxies in one survey may not
contain all of the brightest $b_J$ galaxies in the other.

6. DISCUSSION 

In this paper we have demonstrated that an environmentally 
independent approach to halo occupation can simultaneously 
model both the two-point clustering and the void statistics of 
galaxy samples selected by both luminosity and color. Because
Wp(rp) and $P_0(r)$ weight environments differently, with
Wp(rp) determined predominantly by halos that sit at or above
the mean density and $P_0(r)$ determined by halos that reside in
low-density regions, our results show that, to the limit these
statistics can be measured, $\langle N_{\rm cen}\rangle_M$ is independent of
environment. Although we are not explicitly testing environmental
dependence of satellite galaxy occupation, the results in the
paper offer an implicit test: if the number of satellite galaxies
strongly correlates with halo environment, then the HOD
inferred from modeling Wp(rp) will be systematically biased
and could predict the wrong void distribution regardless of
whether central galaxies exhibit assembly bias. In Paper I we
demonstrated that Wp(rp) constrains the fraction of galaxies 
that are satellites, and thus the complementary fraction that 
are central. The central galaxy fraction, in turn, strongly in- 
fluences the distribution of void sizes. If the HOD model for 
Wp(rp) is systematically under- or overestimating this quan- 
tity, then the predicted void statistics will not match observa- 
tions. 

It has been suggested that voids and void galaxies represent
a challenge to the ΛCDM model (Peebles 2001). If there
exists substantial mass in underdense regions, the argument
goes, then the observed paucity of low-luminosity galaxies
in these regions is incompatible with the standard hierarchical
clustering picture because low-mass halos in the voids
will contain low-luminosity galaxies. Wechsler et al. (2006)
propose that the assembly bias of low-mass halos may be
related to this so-called 'void phenomenon', because low-mass
halos in underdense regions form later, and the gas within
them may therefore form stars less efficiently due to an
increased photoionizing background. Our results suggest that
there is no void phenomenon for galaxies as faint as ~0.2L*
(a halo mass of ~0.02 M*, the minimum mass probed in the
Wechsler et al. 2006 results). The observed voids in samples
of low-luminosity galaxies match the voids predicted by the
typical halos those galaxies occupy. The data presented here
leave little room for a shift in galaxy formation efficiency
(or a shift in the typical halo occupied) in low-density
regions. These results are in agreement with the results of
semi-analytic models of Mathis & White (2002) and Benson et al.
(2003), which find that low-luminosity galaxies avoid the
voids defined by the brighter galaxies. As Wechsler et al.
(2006) suggest, assembly bias may influence the formation
of fainter void galaxies, but larger observational samples are
required to fully address this problem.

In the semi-analytic results of Croton et al. (2007), the
impact of the assembly bias on the correlation function ranges
from $b_f - 1 = 0.05$ for faint galaxies to $b_f - 1 = -0.05$ for bright
samples. We have shown that, at least in the class of ($f_{\rm min}$, $\delta_c$)
models considered here, assembly bias of this order for central
galaxies cannot be reconciled with the measured void statistics.
Density-dependent models that produce acceptable fits
to $P_0(r)$ and $P_U(r)$ produce values of $|b_f - 1| < 0.02$. At some
level, assembly bias should be present in the galaxy population,
but we have ruled out a strong dependence of $\langle N_{\rm cen}\rangle_M$ on
$\delta$ that could bias measurements of halo occupation parameters
or constraints on cosmological parameters obtained through
the application of the HOD. At the level of precision of the
current generation of large-scale redshift surveys, our results
suggest that assembly bias is generally not a concern, though
it could still have some influence on statistical measures not
constrained (directly or indirectly) by our analysis. The
assembly bias issue will need to be revisited for accurate
analysis of the next generation of galaxy redshift surveys, when
percent-level effects become significant.

For luminosity samples, it is perhaps not unexpected that 
Fig. 9. — Panels (a)-(c) show a comparison between 2dFGRS VPF data and HOD model predictions. Points with error bars are observational measurements.
Solid lines are the HOD predictions from the best-fit model, obtained from the 400 $h^{-1}$ Mpc box. In panel (d), the points show the 2dFGRS data, while the shaded
region shows the range of predictions for models with the highest and lowest values of $\sigma_{\log M}$ that produce a $\Delta\chi^2_{w_p} < 1$ with respect to the best-fit model, which
has a value of $\sigma_{\log M} = 0.53$. The lower bound represents a model with $\sigma_{\log M} = 0.9$, and the upper bound represents a model with $\sigma_{\log M} = 0.05$. The solid line is a
model in which the maximum value of $\langle N_{\rm cen}\rangle_M$ is 0.5, as opposed to 1 for the other models. This low-$\langle N_{\rm cen}\rangle_M$ model has a value of $\sigma_{\log M} = 0.73$.



environmental effects are small. For color-defined samples,
on the other hand, our results are more surprising. Because
halo formation time depends so strongly on local density, with
younger low-mass halos living in low-density regions, one
might naturally expect the stellar populations within these
halos to reflect this trend. This would lead to an assembly bias
such that, at fixed mass, the lower the local density, the larger
the fraction of central galaxies that are blue in color. This is
exactly the type of bias seen in the models of Croton et al.
(2007): low-mass halos with red central galaxies form near
z ≈ 2, while low-mass halos with blue central galaxies have
formation redshifts of z ≈ 1.5 or less. However, in our
"standard" HOD analysis of color-defined samples, we assume that
the red central galaxy fraction is independent of environment.
Equation ^ implies, when applied to the samples explored in
§4, that a 10"-^ $h^{-1}\,M_\odot$ halo has a 30% chance of hosting
a red central galaxy regardless of environment or formation
time. In Croton et al. (2007), assembly bias results in
voids in the red galaxies that are significantly larger than in
the HOD prediction. The results of Figures 7 and 8 support a
central red fraction that is environment independent. In contrast
to luminosity-defined samples, the current precision of
the SDSS is sufficient to exclude the level of assembly bias
measured in Croton et al. (2007) for color-defined samples,
which in their model is driven primarily by central galaxies.

Inconsistencies between observed properties of the red
galaxy population and the predictions of Croton et al. (2007)
have been reported elsewhere as well. Springel et al. (2005)
show that the amplitude of the two-point correlation function
of red galaxies in the Millennium Run semi-analytic galaxy
population is much higher than observations at all scales.
Weinmann et al. (2006), using a catalog of galaxy groups
created from the SDSS, demonstrate that the red galaxy fraction
of groups is too high in the Croton et al. (2006b) model.
Baldry et al. (2006) investigate the red fraction as a function
of local galaxy density in the SDSS, measuring a monotonic
decrease in red fraction with decreasing density. Such a
correlation can naturally result from the dependence of the halo
mass function on local environment without invoking assembly
bias (Berlind et al. 2005): red galaxies are primarily satellites
in high-mass halos (see Figure 7b), and the frequency of
such halos correlates strongly with local density. Baldry et al.
(2006) find that the correlation of red galaxy fraction with
environment is much steeper in the Croton et al. (2006b) model
than measured in the SDSS. They show that this result is
primarily due to the correlation between red central galaxies and
environment. Although Baldry et al. define density by local
galaxy density in redshift space using a nearest-neighbor
criterion, the relation they find in the Croton et al. (2006b) model
between the red fraction of central galaxies and density is
similar to the models tested in §3.3, with a sharp decrease in red
central fraction by nearly an order of magnitude at densities
below the mean.

These discrepancies between semi-analytic models and
observations offer insight into galaxy formation
processes. The aspect of the Croton et al. (2006b) model that
most directly influences galaxy color is its treatment of AGN
feeding and feedback, which heats the gas and halts star
formation. In the model, this mechanism is correlated with
environment to produce the color-dependent assembly bias. This
work, and the papers listed above, suggest that a gas-heating
mechanism less sensitive to halo environment will bring the
models into better agreement with the clustering data.

Regardless of the details of galaxy formation, it is now well
established that correlations exist between halo properties
and environment, especially for the low-mass halos that
contain $M_r - 5\log h = -19$ galaxies. Gao & White (2006)
show that environment correlates with halo formation time,
concentration, and spin for M < M*. Why then is the
correlation with galaxy properties so weak? To produce the
observed void statistics, the luminosity of a central galaxy must
be largely uncorrelated with halo properties other than mass,
in the sense that the correlation must be significantly smaller
than the scatter in $L_c$ at a given halo formation time or
formation history. Additionally, central galaxy color must also
be weakly correlated with halo formation. The amount of
star formation required to make a red galaxy blue is relatively
small, while the amount of time required for a blue galaxy
to passively evolve into a red object can be < 1 Gyr (see
Faber et al. 2005 and references therein). If the color
distribution is determined mainly by the occurrence of recent star
formation, galaxy colors may be a stochastic process in better
agreement with the assumptions of the HOD. Rojas et al.
(2004, 2005) find that void galaxies have higher specific star
formation rates than galaxies in higher-density environments.
As with the color-density relation (Berlind et al. 2005), the
correlation of star formation rate with environment may
reflect changes in the underlying halo mass function between
low and high densities rather than a correlation with
formation history.

The nature of voids has been an important question since
their discovery in the first large galaxy redshift surveys
(Gregory & Thompson 1978; Kirshner et al. 1981). Are
voids truly empty of matter or just deficient in galaxies? Is a 
non-gravitational process required to explain their observed 
sizes? These questions have become better defined through 
convergence on a standard cosmological model and better un- 
derstanding of the relation between galaxies and dark matter 
halos. We find that the sizes and emptiness of observed voids 
are in excellent agreement with straightforward theoretical 
predictions. 